By Derek Comingore
Are the BI improvements in SQL Server 2008 compelling enough to make you want
to upgrade? Find out what you'll get with the new release.

Temporal Histograms
By Itzik Ben-Gan
Dealing with the distribution of events over time can be tricky. Examine how
you can break the problem into steps. And if handling all of a task's
requirements at once is too complicated, relax some requirements, solve a
simpler form of the problem, then reintroduce the complexity layers.

Hack Your Database Before the Hackers Do
By Don Kiely
Fortify the security of your SQL Server environment by using an assortment of
tools for locating servers, identifying security best practices, cracking
passwords, and finding vulnerabilities.

Creating Dimensions in SSAS
By Craig Utley
Help users analyze data quickly. Learn how to build proper dimensions and add
hierarchies to support users' navigational and analysis needs.

In Search of Duplicate Indexes on Your Tables
By Andrew J. Kelly
Using this expert advice, identify duplicate or redundant indexes on the
tables in your databases.

Data Warehousing: Junk Dimensions
By Michelle A. Poolet
Keep your data warehouse design simple by placing miscellaneous attributes
into junk dimensions.

T-SQL 101, Lesson 3
By Bill McEvoy
SELECT isn't just for retrieving data. You can also summarize that data by
incorporating COUNT, MIN, MAX, AVG, and SUM functions into SELECT queries.

Tool Time: SSMS Tools Pack
By Kevin Kline
The SSMS Tools Pack provides features, such as a query logging tool, that
enhance SQL Server Management Studio's functionality.

Looking for more SQL Server business intelligence (BI) tips and techniques?
We're introducing Essential BI UPDATE, a free, biweekly BI email newsletter.
Check it out and let me know what you think.
Christan Humphries, Your Savvy Assistant

PRODUCTS

Comparative Review: SQL Server Change Management Tools
By John Green
These solutions help protect your SQL Server environment from unwanted
modification by registering changes to data, server configurations, and
schemas. Find out which one you can use to roll back changes or to migrate
databases to a different platform.

New Products
Check out new and improved SQL Server-related products from LogRhythm,
Embarcadero, Vertica, and Markus Gruber.

IN EVERY ISSUE

Editorial: What If You Had a Benchmark and Nobody Came?
By Michael Otey


What If You Had a Benchmark 
and Nobody Came? 


At the SQL Server 2008 launch event in Los Angeles, François Ajenstat, a
director of product management for SQL Server, showed me some of the
impressive benchmark numbers that were generated by the new release.
Benchmark testing is pretty much obligatory for a new database release
because it enables customers to compare the new release to other database
platforms. Let's take a look at some of SQL Server 2008's benchmark scores.

As loyal SQL Server customers have come to expect, SQL Server 2008 running on
the latest high-powered x64 multi-core hardware set several new benchmark
records. Most notably, SQL Server 2008 running on Windows Server 2008
recorded Microsoft's first-ever result in the 10TB category of the TPC-H
decision-support benchmark with a score of 63,000 Query-per-Hour (QphH)
running on a 32-proc (64-core) HP Superdome Itanium server. The TPC-H
benchmark is very demanding, and only six other TPC-H scores have been
recorded in 10 years.

Other significant SQL Server 2008 benchmark scores include a world-record SAP
Sales and Distribution (SD) Benchmark score for 4-Socket Industry Standard
Blade servers in a three-tier test. In conjunction with Unisys, Microsoft and
SQL Server 2008 also set a new standard for extraction, transformation, and
loading (ETL) performance by loading 1TB of data in less than 30 minutes.

In addition, Microsoft posted a new set of TPC-E results for SQL Server 2008.
SQL Server 2008's TPC-E results include a score of 1126 tpsE on an Itanium
32-proc (64-core) server. This result is the first TPC-E result on a 64-way
server and the first score of more than 1000 tpsE, beating the previous high
score by 70 percent. SQL Server also set a new four-socket server high of 479
tpsE on a Xeon 4-proc (16-core) server, which is a 14 percent performance
gain over SQL Server 2005 and Windows Server 2003.

However, as François shared SQL Server 2008's TPC-E benchmark scores with me,
I couldn't help but notice that Microsoft is the only database vendor
participating in the TPC-E benchmark. Seeing as the purpose behind the TPC
benchmarks is to provide a tool that can be used to fairly compare competing
database and hardware platforms, it's a bit anticlimactic when the results
include the scores of only a single database vendor. It's true that the TPC-E
benchmark is fairly new; it was first introduced in March of 2007. However,
since that time, Microsoft has been the only database vendor to post TPC-E
scores. If you take a look at the Transaction Processing Performance
Council's Web site at www.tpc.org, you'll see that Oracle and IBM are
continuing to submit TPC-C scores.

So what's the purpose of the TPC-E benchmark? The idea behind the new
database benchmark was to more accurately reflect the workloads of today's
applications. The TPC-C benchmark was designed more than 15 years ago and is
based on an order-entry-shipment model. One clear indicator of the TPC-C
benchmark's age is its primary measurement: transactions per minute (tpm).
Fifteen years ago, a tpm measurement was reasonable. However, today's
high-powered systems are capable of much more. Now it's more valid to measure
transactions per second (tps), the way it's done in the TPC-E benchmark. The
TPC-E benchmark is designed to reflect the workload of a financial brokerage
firm and defines a required mix of sample transactions such as trades,
account inquiries, and market research. The benchmark itself is scalable
based on the number of customers defined to represent small, medium, and
large businesses.

Although the database world could use an updated benchmark, a benchmark
without industry support is of limited value. More TPC-E benchmark scores
won't do Microsoft or its customers any good until one of the other major
database vendors releases TPC-E benchmark scores. Until then, Microsoft, its
partners, and its customers would be better served by continuing to post
TPC-C scores, which enable customers to compare SQL Server, Oracle, and DB2.
Michael Otey (mikeo@windowsitpro.com) is technical director for Windows IT
Pro and SQL Server Magazine and coauthor of SQL Server 2005 Developer's Guide
(Osborne/McGraw-Hill).
LISTING 1: Code You Can Customize in sp_ShowBackups

CREATE PROCEDURE sp_ShowBackups
(
    @days   smallint = 1,
    @dbname sysname  = '%'
)

Editor's Tip

Share your SQL Server code, comments, discoveries, and solutions to problems.
Email your contributions to r2r@sqlmag.com. Please include your full name and
phone number. We edit submissions for style, grammar, and length. If we print
your submission, you'll get $100.

Karen Bemowski, senior editor


Figure 1: The BACKUP SUMMARY section


8 May 2008 


MORE on the WEB


Download the code at 
InstantDoc ID 98570. 


Take the Drudgery Out of Making Sure Your 
Databases Are Being Backed Up Properly 


As Eric Peterson noted in his Reader to Reader article, “Keep Track of Your
Backups” (September 2007, InstantDoc ID 96264), one of the most important
tasks of a DBA is to perform backups. However, picking through event logs to
ensure the databases are being backed up properly can be quite
time-consuming, especially for DBAs who manage multiple servers and
databases. Like Eric, I wrote some code, sp_ShowBackups, to take the drudgery
out of examining event logs and backup jobs. This stored procedure generates
an easy-to-read report that details all the various database backups that
have occurred on a server within the specified number of days.

The sp_ShowBackups stored procedure relies on the vBackupHistory view, which
needs to be created in the msdb database. This view joins several
backup-related tables together and can be quite useful on its own. You can
download sp_ShowBackups, vBackupHistory, and a sample report
(sp_ShowBackups_SampleOutput.txt) by going to www.sqlmag.com, entering 98570
in the InstantDoc ID text box, clicking Go, then clicking the 98570.zip
hotlink.

By default, sp_ShowBackups is set up to obtain the backup information for the
past day on all the databases on the server on which you run the stored
procedure. To specify a different number of days, you can change the @days
parameter's value from 1 to the desired number of days. (See Listing 1.) If
you want to obtain the backup information for only one database rather than
all the databases on the server, you can change the @dbname parameter's value
from % to the database's name.
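
For a one-off report you might not need to edit the defaults at all. The calls below are a minimal sketch that assumes the downloaded procedure keeps the two parameters shown in Listing 1 (@days and @dbname) and that it is run from the database in which it was created. The database name SQLHunter is borrowed from the sample report purely for illustration.

```sql
-- Assumes sp_ShowBackups and the vBackupHistory view from 98570.zip
-- are already installed; parameter names are taken from Listing 1.

-- Default behavior: past day, all databases.
EXEC sp_ShowBackups;

-- Past 7 days, all databases.
EXEC sp_ShowBackups @days = 7;

-- Past 7 days, one database only (illustrative name).
EXEC sp_ShowBackups @days = 7, @dbname = 'SQLHunter';
```

Passing values at call time leaves the stored defaults untouched, which is convenient when several DBAs share the same server.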

The report produced by sp_ShowBackups has five sections. The first section is
“BACKUP SUMMARY,” which consists of two components, as Figure 1 shows. One
component lists the type and number of backups that have occurred on each
database since that database was created (assuming the database's backup
history hasn't been deleted from the system tables). The other component uses
YES and NO indicators to reveal whether or not a particular type of backup
has occurred on each database within the specified time period. Vigilant DBAs
can look at either component to determine whether all the backups from the
previous day completed successfully. However, I prefer the second component
because I've found that the visual presentation of the YES and NO indicators
lets me quickly detect problems with just a cursory glance.

The next three sections (“FULL DATABASE BACKUPS,” “INCREMENTAL BACKUPS,” and
“TRANSACTION LOG BACKUPS (100 max)”) provide detailed information for full
backups, incremental backups, and transaction log backups. The details
include the size of the backup, the duration (HH:MM:SS), start time, finish
time, and the user who initiated the backup. The meticulous DBA can use this
information to quickly and easily find out the size and duration of the
backups. The security-conscious DBA can also use this portion of the report
to look for any unauthorized backups.
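
The same kind of detail can be pulled straight from the backup history tables in msdb. The query below is only a sketch of the idea, not the vBackupHistory view from the download, whose definition isn't reproduced here and may differ. The column aliases and the local @days variable are illustrative; msdb.dbo.backupset and msdb.dbo.backupmediafamily are the standard system tables.

```sql
-- Sketch only: lists size, duration, start/finish time, and initiating user
-- for backups taken in the last @days days.
DECLARE @days smallint;
SET @days = 1;

SELECT
    bs.database_name,
    CASE bs.type
        WHEN 'D' THEN 'FULL'
        WHEN 'I' THEN 'INCREMENTAL'   -- SQL Server calls this a differential backup
        WHEN 'L' THEN 'LOG'
        ELSE bs.type                  -- file, filegroup, and partial backups
    END AS backup_type,
    CAST(bs.backup_size / 1048576.0 AS decimal(18, 2)) AS size_mb,
    -- Style 108 formats as hh:mm:ss; durations of 24 hours or more wrap around.
    CONVERT(char(8),
            DATEADD(second,
                    DATEDIFF(second, bs.backup_start_date, bs.backup_finish_date),
                    0),
            108) AS duration_hhmmss,
    bs.backup_start_date,
    bs.backup_finish_date,
    bs.user_name,
    bmf.physical_device_name
FROM msdb.dbo.backupset AS bs
JOIN msdb.dbo.backupmediafamily AS bmf
    ON bmf.media_set_id = bs.media_set_id
WHERE bs.backup_start_date >= DATEADD(day, -@days, GETDATE())
ORDER BY bs.database_name, bs.backup_start_date;
```

A backup striped across several devices returns one row per device here, so drop the join to backupmediafamily if you only want one row per backup.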

The final section is “BACKUP THROUGHPUT (MB/s),” which specifies how many
megabytes per second the backup processes are achieving. This data is broken
down by backup type and month. This section is mainly for DBAs who want to
brag about how fast their I/O subsystem is.

I originally wrote sp_ShowBackups for SQL Server 2000. However, it works well
on SQL Server 2005, which is a nice surprise.

Bill McEvoy, master chef/DBA, Cooking with SQL

Backups since: 2008-03-29 21:47:53

== BACKUP SUMMARY ==

Database         Type  Backups
CookingwithSQL   DB    2
CookingwithSQL   LOG   10
MyTinyStats      DB    1
SQLHunter        DB    2
SQLHunter        LOG   12
VeritatemQuaere  DB    3
VeritatemQuaere  INCR  2

Database
CookingwithSQL   YES: 2  NO
master           NO      NO
msdb             NO      NO
MyDB             NO      NO
MyTinyStats      YES: 1  NO
SQLHunter        YES: 2  NO
VeritatemQuaere  YES: 3  YES: 2
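
The throughput figure in the last report section is simple arithmetic: bytes backed up, converted to megabytes, divided by elapsed seconds. The query below is a rough sketch of that calculation grouped by backup type and month. It reads msdb.dbo.backupset directly and is an assumption about how such a number could be derived, not the code from sp_ShowBackups.

```sql
-- Sketch only: average backup throughput in MB/s by backup type and month.
SELECT
    bs.type AS backup_type,                                       -- D, I, L, ...
    CONVERT(char(7), bs.backup_start_date, 120) AS backup_month,  -- yyyy-mm
    CAST(SUM(bs.backup_size) / 1048576.0
         -- NULLIF avoids a divide-by-zero error when every backup in a group
         -- finished in under a second; the result is NULL in that case.
         / NULLIF(SUM(DATEDIFF(second,
                               bs.backup_start_date,
                               bs.backup_finish_date)), 0)
         AS decimal(18, 2)) AS mb_per_sec
FROM msdb.dbo.backupset AS bs
GROUP BY bs.type, CONVERT(char(7), bs.backup_start_date, 120)
ORDER BY backup_month, backup_type;
```

Summing sizes and durations before dividing weights large backups more heavily than averaging per-backup rates would, which tends to give a steadier number for an I/O subsystem.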


SSMS Tools Pack: 6 Tools That Help You Do More with SSMS
Bridge the gaps in SSMS's functionality

Developers are constantly looking for ways to make SQL Server Management
Studio (SSMS) and Visual Studio (VS) easier and more efficient to use. One
such developer, Mladen Prajdic, is a SQL Server Microsoft Certified
Professional (MCP) and Microsoft Certified Technology Specialist (MCTS) for
SQL Server 2005. He created the SSMS Tools Pack as part of his daily work
with SQL Server and .NET in C#. The SSMS Tools Pack is a set of plug-ins that
enhances SSMS's functionality and bridges the gap between SSMS and VS. These
plug-ins provide developers with features that aren't otherwise available in
SSMS. The SSMS Tools Pack includes the following tools:
* CRUD Procedure Generation: Many SQL Server experts advocate that users
  shouldn't directly access tables for data manipulation. Instead, they
  encourage developers to build applications in which users access either
  views or stored procedures to perform data manipulation tasks. This tool
  generates stored procedures that perform all of your create, read, update,
  and delete (CRUD) operations. You can also use this tool to insert
  templates into T-SQL code and fully customize templates to meet any other
  needs you might have, such as updating a table's statistics or checking its
  size.
* Generate INSERT Statements: Sometimes it's useful to create an INSERT
  statement for each row of data in a table because then you can use the
  script to insert new records into other SQL Server instances without having
  to use replication or DTS. This tool generates an INSERT statement for each
  row of data in the database starting with tables that have primary keys and
  no foreign keys, and then going in dependency order. Binary data and large
  data types, such as image and text columns, can also be scripted if they
  contain fewer than 3MB of data in the large object (LOB) column. When
  adding scripts to a datagrid, the resulting INSERT statements are placed
  into a temporary table for later use.
* Query History: This tool logs every statement that you run in SSMS to a
  file on the local disk or to a table in the specified database, making it
  easy to recall frequently used statements or to track all the changes made
  to source code between the check-out time and check-in time of the source
  T-SQL code.
* Query Template: The query template tool enables certain code, such as a
  comment block or a standard block of T-SQL exception-handling code, to
  appear whenever you start a new query. This tool can save developers a lot
  of time if your organization has certain coding standards that all
  developers must adhere to.
* Search: Oftentimes, the particular occurrence of the search string you're
  looking for will appear in different areas of the graphical execution plan
  window, especially when the execution plan is very large. This tool lets
  you find all occurrences of a given search string in one or more execution
  plans or in the results that are returned in the datagrid.
* Text Regions: Text regions enhance the usability of SSMS and increase your
  productivity as a T-SQL programmer by enabling you to expand or collapse
  large regions of T-SQL code in SSMS. Figure 1 shows how you can expand text
  regions by clicking the [+] symbol or collapse them by clicking the [-]
  symbol.

The SSMS Tools Pack is available for SSMS and SSMS Express. Because the SSMS
Tools Pack is an add-in to SSMS and SSMS Express, you must have SSMS
installed.

Figure 1: Collapsing and expanding text regions in SSMS

THE SSMS TOOLS PACK
BENEFITS: The SSMS Tools Pack provides developers with tools that enhance
SSMS's functionality.
SYSTEM REQUIREMENTS AND NOTES: SQL Server 2005 or SQL Server Express; SSMS or
SSMS Express
HOW TO GET IT: You can download the SSMS Tools Pack from
www.ssmstoolspack.com.

Kevin Kline (kevin.kline@quest.com) is the director of technology for SQL
Server Solutions at Quest Software and a founding board member of the
international Professional Association for SQL Server. He is the author of
SQL in a Nutshell, 2nd edition (O'Reilly Media, 2004).
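
To make the idea behind the Generate INSERT Statements tool concrete, here is a hand-rolled sketch of scripting one INSERT per row with plain T-SQL. It is not how the SSMS Tools Pack works internally, and the #Colors table, its columns, and the dbo.Colors target name are hypothetical stand-ins.

```sql
-- Hypothetical table and rows, used only for illustration.
CREATE TABLE #Colors
(
    color_id   int         NOT NULL PRIMARY KEY,
    color_name varchar(30) NOT NULL
);

INSERT INTO #Colors (color_id, color_name) VALUES (1, 'Red');
INSERT INTO #Colors (color_id, color_name) VALUES (2, 'Painter''s White');

-- Build one INSERT statement per row. Embedded single quotes are doubled
-- so that the generated script remains valid T-SQL.
SELECT 'INSERT INTO dbo.Colors (color_id, color_name) VALUES ('
       + CAST(color_id AS varchar(11)) + ', '''
       + REPLACE(color_name, '''', '''''') + ''');' AS insert_statement
FROM #Colors
ORDER BY color_id;

DROP TABLE #Colors;
```

A hand-written generator like this has to be adapted for every table, nullable columns, and each data type, which is the tedium a scripting tool takes over.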


Show the distribution of events over time 


Recently I received a request from a customer to come up with a solution that
produces temporal histograms: histograms showing the distribution of events
over time periods. The problem was an interesting challenge and seemed like a
generic need, so I decided to cover it in my column. First, I'll explain the
problem in terms of inputs and desired output. Next, I'll provide a solution
that handles only a specific case of the problem. Then, I'll show you how to
enhance the solution to make it more generic.

MORE on the WEB
Download the listings and see the Web tables at InstantDoc ID 98360.


The Challenge
Suppose that you have a table called Events in your database, containing
information about events in time. These events can be appointments, sessions,
or anything that has start and end points in time. The Events table has three
columns: event_id is the primary key, event_start is the start point in time
of the event, and event_end is the end point. Run the code in Web Listing 1
(www.sqlmag.com, InstantDoc ID 98360) to create the Events table in the
tempdb database and populate it with sample data.
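
If you don't have Web Listing 1 at hand, a table of the described shape is easy to improvise. The script below is a stand-in under stated assumptions: the three column names come from the description above, whereas the data types, the CHECK constraint, and the rows are made up for experimentation and are not the article's sample data.

```sql
USE tempdb;
GO
IF OBJECT_ID('dbo.Events', 'U') IS NOT NULL
    DROP TABLE dbo.Events;
GO
CREATE TABLE dbo.Events
(
    event_id    int      NOT NULL PRIMARY KEY,
    event_start datetime NOT NULL,
    event_end   datetime NOT NULL,
    CHECK (event_end > event_start)   -- assumption: events have positive length
);

-- Made-up rows: one short event, one spanning midnight, one spanning days.
INSERT INTO dbo.Events (event_id, event_start, event_end)
    VALUES (1, '20080501 08:00', '20080501 09:30');
INSERT INTO dbo.Events (event_id, event_start, event_end)
    VALUES (2, '20080501 22:00', '20080502 02:00');
INSERT INTO dbo.Events (event_id, event_start, event_end)
    VALUES (3, '20080503 12:00', '20080506 12:00');
GO
```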

You need to write a table function that accepts the following inputs:

* @from_dt: start point of a datetime range
* @to_dt: end point of a datetime range
* @date_part: a datetime part from the enumeration 'minute', 'hour', 'day',
  'week', 'month', 'quarter', 'year'
* @num_parts: the number of datetime parts to be covered in each step

The table function should produce a histogram showing the number of active
events during each fixed interval of time within the requested datetime
range. The intervals of time, or steps, are based on the input @date_part and
@num_parts.

The problem is best explained through an example. Given the inputs @from_dt =
'20080501 00:00', @to_dt = '20080511 00:00', @date_part = 'day', @num_parts =
1, you're supposed to produce the output in Web Table 1. Each row in the
output represents a different fixed interval of time (from_dt - to_dt) within
the requested datetime range. Because the requested date part is 'day', and
the number of parts is 1, each step in the histogram represents one day. The
fourth column in the output (num_events) holds the count of events from the
Events table that were active during the current interval.

Regarding the histogram step boundary points, one of the requirements from my
customer was to produce round points in time (round in respect to the input
@date_part), except for the extreme boundary points that must be the ones
provided by the user. For example, given the inputs @from_dt = '20080501
12:30', @to_dt = '20080510 10:00', @date_part = 'day', @num_parts = 1, the
step boundary points should be those in Web Table 2. Notice that the first
step's low boundary point is 2008-05-01 12:30:00.000 and the last step's high
boundary point is 2008-05-10 10:00:00.000, whereas all the other step
boundary points represent whole days (in terms of day units).


Producing the Histogram Steps 
You can start by creating a table function (call it 
fn_HistSteps) that returns the histogram steps table 
based on the previously mentioned input parameters. 
The function will return a row for each step with the 
step boundary points. Once defined, you can join the 
function with the Events table to match steps and 
events, group the result of the join by step, and count 
the number of events in each step. 
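
The join-group-count step described above might look roughly like the following. This is a sketch that assumes the one-day version of fn_HistSteps (two datetime arguments, returning n, from_dt, and to_dt) and the Events table already exist in tempdb. The overlap predicate treats each step as a half-open interval, which is an assumption about what "active during the interval" means rather than something the column spells out.

```sql
USE tempdb;
GO
-- Count the events that overlap each histogram step.
SELECT
    S.n,
    S.from_dt,
    S.to_dt,
    COUNT(E.event_id) AS num_events   -- counts 0 for steps with no match
FROM dbo.fn_HistSteps('20080501 00:00', '20080511 00:00') AS S
LEFT OUTER JOIN dbo.Events AS E
    -- An event is active in a step if it starts before the step ends
    -- and ends after the step starts.
    ON  E.event_start < S.to_dt
    AND E.event_end   > S.from_dt
GROUP BY S.n, S.from_dt, S.to_dt
ORDER BY S.n;
```

The outer join keeps steps in which no event was active, so the histogram has no gaps; an inner join would silently drop those rows.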

So that you don't have to deal with too many aspects of the problem at once,
you can first relax some of the requirements. For example, take the
@date_part and @num_parts inputs out of the equation, and solve the task for
a specific interval, say, one day. After you manage to solve the problem for
a specific interval, you can add the logic required to handle the requested
@date_part and @num_parts inputs.

In my solution I used an auxiliary table of numbers that you create and
populate by running the code in Web Listing 2. This code creates a table
called Nums with a single column called n, and populates the table with
integers in the range 1 through 1,000,000.
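
Web Listing 2 isn't reproduced here, so the script below is one common way to build such a table, offered as a sketch rather than the column's own code. It doubles the row count on each pass and then tops up the remainder; the table and column names follow the description above, and the variable names are illustrative.

```sql
SET NOCOUNT ON;
USE tempdb;
GO
IF OBJECT_ID('dbo.Nums', 'U') IS NOT NULL
    DROP TABLE dbo.Nums;
GO
CREATE TABLE dbo.Nums (n int NOT NULL PRIMARY KEY);

DECLARE @max int, @rc int;
SET @max = 1000000;   -- highest number to generate
SET @rc  = 1;         -- current row count

INSERT INTO dbo.Nums (n) VALUES (1);

-- Each pass copies the existing rows shifted by @rc, doubling the table.
WHILE @rc * 2 <= @max
BEGIN
    INSERT INTO dbo.Nums (n) SELECT n + @rc FROM dbo.Nums;
    SET @rc = @rc * 2;
END

-- Fill in the remaining values up to @max.
INSERT INTO dbo.Nums (n)
    SELECT n + @rc FROM dbo.Nums WHERE n + @rc <= @max;
GO
```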

To create the first version of the fn_HistSteps func- 
tion, run the code in Web Listing 3. The function is an 
inline table-valued function based on a single query with 


Dealing with 
temporal 
data can be 
quite tricky. 


multiple common table expressions (CTEs). The first 
CTE defined by the function’s code is called CO and it 
has two columns: floor_from_dt and diff. The former is a 
floor of the input @from_dt value in terms of day units; 
that is, midnight of the input (from dt value. The latter 
is the number of days in the range @from_dt - (ato dt. 
The second CTE is called СІ; it's in charge of pro- 
ducing steps with round boundary points. This task is 
achieved by joining Nums and Steps, and returning all 
n values that are smaller than or equal to diff (number 
of days in the input range). The starting point of each 
step (from dt) is calculated by adding n-1 days to 
floor from dt, and the ending point of each step (to_ 
dt) is calculated by adding n days to floor from dt. 
The third CTE is called C2; it's in charge of
adjusting the extreme boundary points (start point of
first step and end point of last step) if they need adjustment.
Remember that the previous CTE (C1) produced
round boundary points, although the requirement
was that the extreme boundary points would be those
provided by the user as the input datetime range
boundary points. Note that as a result of adjusting the
extreme boundary points, C2 might end up with rows
representing irrational steps where to_dt isn't greater
than from_dt. Those rows will be eliminated by the
outer query. The outer query simply returns all rows
from C2 representing the histogram steps, excluding
the irrational steps produced by a previous CTE.
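
Web Listing 3 isn't reproduced in this text. The following is a sketch reconstructed from the description above, not the article's code. The function name fn_HistSteps_Sketch is hypothetical, and the "+ 1" in the diff calculation is an assumption: it generates one extra candidate step so that a partial last day is covered, and it relies on the outer filter to drop the step when it turns out to be empty.

```sql
-- Sketch only: reconstructed from the prose, not the article's Web Listing 3.
-- Assumes dbo.Nums(n) holds the integers 1 through 1,000,000.
CREATE FUNCTION dbo.fn_HistSteps_Sketch
(
  @from_dt AS DATETIME,
  @to_dt   AS DATETIME
) RETURNS TABLE
AS
RETURN
  WITH C0 AS
  (
    SELECT
      -- Midnight of the day containing @from_dt
      DATEADD(day, DATEDIFF(day, 0, @from_dt), 0) AS floor_from_dt,
      -- Number of candidate one-day steps (the + 1 is an assumption)
      DATEDIFF(day, @from_dt, @to_dt) + 1         AS diff
  ),
  C1 AS
  (
    -- Steps with round (midnight) boundary points
    SELECT n,
      DATEADD(day, n - 1, floor_from_dt) AS from_dt,
      DATEADD(day, n,     floor_from_dt) AS to_dt
    FROM C0
      JOIN dbo.Nums
        ON n <= diff
  ),
  C2 AS
  (
    -- Clip the first and last steps to the requested range
    SELECT n,
      CASE WHEN from_dt < @from_dt THEN @from_dt ELSE from_dt END AS from_dt,
      CASE WHEN to_dt   > @to_dt   THEN @to_dt   ELSE to_dt   END AS to_dt
    FROM C1
  )
  -- Remove "irrational" steps whose end isn't after their start
  SELECT n, from_dt, to_dt
  FROM C2
  WHERE to_dt > from_dt;
```

Clipping in C2 rather than in C1 keeps the step arithmetic simple: C1 only ever deals with whole days, and the user's exact boundaries are applied in one place.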

To test the function, run the following code: 


SELECT *
FROM dbo.fn_HistSteps('20080501 00:00', '20080511 00:00') AS S
ORDER BY n;


You should get the steps shown in Web Table 1, 
without the num_events column.

To test the function with nonround range boundary 
points, query it with the following inputs: 


SELECT *
FROM dbo.fn_HistSteps('20080501 12:30', '20080510 10:00') AS S
ORDER BY n;


You should get the steps shown in Web Table 2; again, 
excluding the num_events column.

Now that your function works for one-day intervals,
you can add logic to support a requested date
part (@date_part) and number of parts (@num_parts).
The revision to the function isn't complicated. You
need to replace all expressions that currently use
the date part day with a CASE expression that uses
the requested date part. Also, when calculating diff,
you'll need to divide the value by @num_parts as part
of the expression. To create the revised fn_HistSteps
function, run the code in Web Listing 4. Note that if
the function is invoked with an unrecognized part, the
CASE expressions will default to ELSE NULL, the
query filter will filter out all rows, and the function will
return an empty set.
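
Web Listing 4 isn't reproduced in this text. As an illustration of the kind of CASE expression described above (a sketch under assumptions, not the article's code), the batch below computes the floor and diff values for a requested date part. Only 'hour' and 'day' are handled, the variable values are sample inputs, and the "+ 1" is the same assumption of one extra candidate step that the outer filter would later discard if it is empty.

```sql
-- Sketch only: shows the CASE-per-date-part idea, not the article's Web Listing 4.
DECLARE
  @from_dt   AS DATETIME,
  @to_dt     AS DATETIME,
  @date_part AS VARCHAR(10),
  @num_parts AS INT;

SET @from_dt   = '20080501 12:30';
SET @to_dt     = '20080502 00:00';
SET @date_part = 'hour';
SET @num_parts = 4;

SELECT
  -- Floor of @from_dt in units of the requested date part
  CASE @date_part
    WHEN 'hour' THEN DATEADD(hour, DATEDIFF(hour, 0, @from_dt), 0)
    WHEN 'day'  THEN DATEADD(day,  DATEDIFF(day,  0, @from_dt), 0)
    -- No ELSE branch: an unrecognized part yields NULL
  END AS floor_from_dt,
  -- Number of candidate steps of @num_parts units each (the + 1 is an assumption)
  CASE @date_part
    WHEN 'hour' THEN DATEDIFF(hour, @from_dt, @to_dt)
    WHEN 'day'  THEN DATEDIFF(day,  @from_dt, @to_dt)
  END / @num_parts + 1 AS diff;
```

Because a CASE expression with no matching branch returns NULL, any comparison against the resulting values is unknown, which is why an unrecognized part produces no rows.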

Now you can specify the date part and the number
of parts as inputs. For example, to get a steps table
for the range '20080501 00:00' - '20080502 00:00'
with four-hour intervals, you'd query the function as
follows:


SELECT *
FROM dbo.fn_HistSteps('20080501 00:00', '20080502 00:00', 'hour', 4) AS S
ORDER BY n;


Producing the Actual Histogram

Most of the work is now behind you; what's left is to
join the fn_HistSteps function with the Events table
to match steps and events, group the results by step,
and return the count of active events in each step. To
check whether an event (starting at event_start and
ending at event_end) overlaps with a step (starting
THE LOGICAL PUZZLE


Solution to April’s Puzzle: 
Suicidal Mosquito 
Two trains drive toward each other on the same rail. 
Both trains drive at a speed of 100MPH. When the
trains are 100 miles apart, a mosquito starts flying 
back and forth from the front of one train to the 
other at a speed of 200MPH. What total distance 
will the mosquito cover before the two trains crash? 
For some people the puzzle might seem to 
require infinity calculations. Although solving the 
puzzle this way is possible, such a solution is 
unnecessarily complicated. The simplest way to 
think about the puzzle is to consider only duration 
and speed. The time it takes until the trains crash 
is half an hour, and the speed of the mosquito is 
200MPH. With such speed and duration, the mos- 
quito would cover 100 miles. 
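
Written out, the duration-and-speed reasoning is a two-step calculation. The trains close the 100-mile gap at their combined speed, and the mosquito flies at its own speed for that whole time:

```latex
\[
t = \frac{100\ \text{miles}}{100\ \text{MPH} + 100\ \text{MPH}} = 0.5\ \text{hours},
\qquad
d = 200\ \text{MPH} \times 0.5\ \text{hours} = 100\ \text{miles}
\]
```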


May's Puzzle: A Cat, a String, and the Earth

This month's puzzle is quite simple, but I like it
because it's so counterintuitive. Suppose you lay a
string on the ground all around the earth right over 
the equator. The length of the string would be equal 
to the earth's equatorial circumference—40,075.02 
kilometers. Then, suppose you add 1 meter to the 
string, and suspend the string directly above the 
equator, with an even distance from the ground all 
the way around. Would a cat be able to pass from 
one hemisphere to another below the string? 


InstantDoc ID 98361 


at from_dt and ending at to_dt), you can use the following
predicate:


event_start < to_dt AND event_end > from_dt


Note that you can use <= instead of <, and >= instead 
of > depending on how you want to treat the boundary 
point itself (inclusive versus exclusive). 
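
To see the difference the boundary treatment makes, here is a small standalone sketch. The variable values are made-up sample data: an event that ends at exactly the moment a step begins.

```sql
-- Sketch only: sample values chosen so that the event ends exactly
-- where the step starts.
DECLARE
  @event_start AS DATETIME, @event_end AS DATETIME,
  @from_dt     AS DATETIME, @to_dt     AS DATETIME;

SET @event_start = '20080501 10:00';
SET @event_end   = '20080501 12:00';
SET @from_dt     = '20080501 12:00';
SET @to_dt       = '20080501 16:00';

SELECT
  -- Exclusive boundaries: touching at a single point doesn't count
  CASE WHEN @event_start < @to_dt AND @event_end > @from_dt
       THEN 'overlaps' ELSE 'no overlap' END AS exclusive_test,
  -- Inclusive boundaries: touching at a single point counts
  CASE WHEN @event_start <= @to_dt AND @event_end >= @from_dt
       THEN 'overlaps' ELSE 'no overlap' END AS inclusive_test;
```

With these inputs the exclusive form reports no overlap and the inclusive form reports an overlap, so the choice determines whether an event is counted in the step that starts the instant the event ends.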

Here's the full query that would give you a daily
histogram for the range '20080501 00:00' - '20080511
00:00', producing the output in Web Table 1:


SELECT n, from_dt, to_dt,
  COUNT(event_id) AS num_events
FROM dbo.fn_HistSteps('20080501 00:00', '20080511 00:00', 'day', 1) AS S
  LEFT OUTER JOIN dbo.Events AS E
    ON E.event_start < S.to_dt
    AND E.event_end > S.from_dt
GROUP BY n, from_dt, to_dt
ORDER BY n;


Notice that an OUTER JOIN is used here instead of
an INNER JOIN in order to return empty steps (steps/
intervals with a num_events value of 0) as well.
The following query returns a daily histogram for
the range '20080501 12:30' - '20080510 10:00',
producing the output in Web Table 2:


SELECT from_dt, to_dt,
  COUNT(event_id) AS num_events
FROM dbo.fn_HistSteps('20080501 12:30', '20080510 10:00', 'day', 1) AS S
  LEFT OUTER JOIN dbo.Events AS E
    ON E.event_start < S.to_dt
    AND E.event_end > S.from_dt
GROUP BY from_dt, to_dt
ORDER BY from_dt;


And finally, the following query returns a four-hour
step histogram for the range '20080501 00:00'
- '20080502 00:00', producing the output in Web
Table 3:


SELECT n, from_dt, to_dt,
  COUNT(event_id) AS num_events
FROM dbo.fn_HistSteps('20080501 00:00', '20080502 00:00', 'hour', 4) AS S
  LEFT OUTER JOIN dbo.Events AS E
    ON E.event_start < S.to_dt
    AND E.event_end > S.from_dt
GROUP BY n, from_dt, to_dt
ORDER BY n;




One Step at a Time 
Dealing with temporal data can be quite tricky. In fact,
I've written extensively about datetime manipulation
(see the Learning Path). When you face such challenges
in which the solution isn't trivial, it's important to
break the problem into steps as I’ve done in this article.
In addition, when handling all of a task's requirements
at once is too complicated, a useful approach is to relax
some of the requirements, solve a simpler form of the
problem, then reintroduce the complexity layers.
InstantDoc ID 98360 
SQL Server 2008, while not a revolutionary release, provides
rich insight into your data
for your organization's business intelligence (BI) needs.
I'll take you on a brief tour of SQL Server Integration
Services (SSIS), SQL Server Analysis Services (SSAS),
and SQL Server Reporting Services (SSRS) enhancements,
so you're armed with information to help you
make decisions about implementing SQL Server 2008
BI solutions. First, I want to point out two relational
engine (query optimizer) enhancements that set the
stage for better BI: partitioned table parallelism and
star-join query optimizations (Optimized Bitmap Filters).
Even though this tour is just a taste of the many
enhanced BI features found in SQL Server 2008, it
should be enough to give you food for thought when
you consider whether upgrading to SQL Server 2008
BI is a smart move, based on your organization's BI
architecture and requirements.


Partitioned Table Parallelism 
SQL Server 2008 improves performance on partitioned
tables that reside on multi-CPU-based systems.
The query optimizer can elect a parallel query
execution plan on these systems to improve the
performance of query and index operations. Fact
tables are often candidates for partitioning in a data
warehouse because they typically contain a few columns
with a very large number of records.

Does your data warehouse have large fact tables
residing on multi-CPU systems? You can benefit by upgrading
to SQL Server 2008 because there's a new
parallel query execution strategy on partitioned
tables. SQL Server 2005 uses a single-thread-per-partition
parallel query execution strategy. In SQL Server
2008, multiple threads can be allocated to a single
partition, thus improving the query's response time.
As of this writing, you can enable this functionality
by setting the trace flag 2440, although this is expected
to change when the product ships. Note that table
and index partitioning requires SQL Server 2008 Enterprise
Edition.
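
For readers who haven't partitioned a table before, the following sketch shows the general shape of a partitioned fact table. All object names (pf_OrderYear, ps_OrderYear, dbo.FactSales and its columns) and the boundary dates are hypothetical, and placing every partition on the PRIMARY filegroup is a simplification for illustration only.

```sql
-- Sketch only: hypothetical names and boundary values.
-- Partition function: maps order dates to partitions by year
CREATE PARTITION FUNCTION pf_OrderYear (DATETIME)
AS RANGE RIGHT FOR VALUES ('20060101', '20070101', '20080101');

-- Partition scheme: maps partitions to filegroups
-- (all on PRIMARY here to keep the example simple)
CREATE PARTITION SCHEME ps_OrderYear
AS PARTITION pf_OrderYear ALL TO ([PRIMARY]);

-- Fact table: few columns, many rows, partitioned on order_date
CREATE TABLE dbo.FactSales
(
  order_date   DATETIME NOT NULL,
  product_key  INT      NOT NULL,
  store_key    INT      NOT NULL,
  sales_amount MONEY    NOT NULL
) ON ps_OrderYear (order_date);
```

A query that filters on a single year touches one partition, which is the case in which allowing more than one thread per partition matters most.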


Star-Join Query Optimizations 
(Optimized Bitmap Filters) 

The query optimizer uses bitmap filtering to eliminate
rows from a second table based on values taken from
the first table. Bitmap filtering is a common query
filtering technique found in star-schema-based queries.
SQL Server 2008 introduces optimized bitmap
filtering. The query optimizer can now introduce
bitmap filters dynamically in the query plan during
generation, as opposed to just after query plan optimization,
as in SQL Server 2005. Optimized bitmap
filtering results in filtering from multiple dimension
tables, and bitmap filters are now applicable to more
query operator types. Optimized bitmap filtering enables
better-performing data warehouse queries that
reference the common star-based schemas.
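
As a point of reference, a star-join query typically has the shape sketched below: one large fact table joined to several small dimension tables, with the filters applied to the dimensions. The table and column names are hypothetical, and whether the optimizer actually chooses bitmap filters for a given query depends on the plan it builds.

```sql
-- Sketch only: hypothetical star schema.
SELECT d.calendar_year, p.category,
  SUM(f.sales_amount) AS total_sales
FROM dbo.FactSales AS f
  JOIN dbo.DimDate AS d
    ON d.date_key = f.date_key
  JOIN dbo.DimProduct AS p
    ON p.product_key = f.product_key
WHERE d.calendar_year = 2007      -- filter on one dimension
  AND p.category = N'Bikes'       -- filter on another dimension
GROUP BY d.calendar_year, p.category;
```

The filters on the two dimension tables are what make it possible to discard nonqualifying fact rows early rather than after the joins.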


SSIS Enhancements 

Any BI solution includes extraction, transformation, 
and loading (ETL) of an organization's data. ETL is 
implemented in SQL Server using SSIS. In SQL Serv- 
er 2005, the SSIS pipeline execution engine doesn't 
scale up to utilize more than one processor in a single 
execution tree. The SQL Server 2008 SSIS data flow 
engine can execute multiple components (threads) in 
a single execution tree. Overall, the 2008 SSIS engine 
is more stable and scalable. It eliminates the potential 
for deadlocks that occasionally occur in SQL Server 
2005 SSIS when you execute packages with complex 
user data in large organizations. 

Lookup transformation. SSIS can be used in a variety
of scenarios; however, it's most commonly used
in ETL. One of the most common SSIS components
used in ETL solutions is the Lookup transformation. 
The SQL Server 2005 SSIS lookup component used 
When building a business IT
environment, using the 
most efficient, cost-effective 
tools available is a critical 
factor. This reality extends 
to creating processes 

for reporting and analyzing business data 
throughout your company. Using the best 
tools optimizes the reporting process and 
maximizes the benefits at each step of the 
business intelligence (BI) workflow. 


Microsoft BI tools such as Microsoft SQL 
Server and SQL Server Reporting Services 
(SSRS) are commonly used in conjunction with 
Microsoft Office Excel to store data, create 
reports, and display data for end users. In 
many cases, these tools are expected to solve 
problems they were not specifically designed 
to solve. These older, more familiar reporting 
technologies simply can’t address advanced 
needs for data understanding. 


However, there are other tools in the 
Microsoft BI stack that provide “one-stop 
shopping” to make BI more streamlined. 


This article reviews frequently used 

Microsoft tool sets, then introduces you to 
Microsoft SQL Analysis Services — a powerful, 
economical, and efficient way to format, 
display, and interpret your company’s business 
intelligence data. 


Microsoft Business 

Intelligence Stack 

Microsoft BI software is designed to analyze 
and present data in a way that helps decision 
makers understand information and make 
decisions quickly. At their best, these tools can 
answer “why” and “how” questions behind 
the numbers presented within a report. 


Many companies look for a solution to their 
business intelligence needs and often come 
up with a mixture of tools. 


The Microsoft BI stack offers a wide variety 
of BI capabilities built into existing Microsoft 
applications such as SQL Server and Microsoft
Office. These capabilities can be combined to 
achieve the ultimate goal of any BI solution: 
Providing smooth, easy access to data needed 
to solve complex business questions. 


Three of the most commonly used pieces of 
the Microsoft BI stack are: 
* Microsoft Office Excel 
* Microsoft SQL Server Reporting Services 
* Microsoft SQL Server Analysis Services 


Microsoft Office Excel 

Companies that want to add a second layer 
to the rich toolset offered by Microsoft SQL 
Server often utilize applications they already 
have in place, such as Microsoft Office Excel. 


Using Microsoft Office Excel

Benefits                        Limitations
User familiarity                PivotTables handle limited amount of data
Smooth SQL Server integration   PivotTables can have slow processing performance
Quick implementation            Data analysis capability is limited


Leveraging Excel can give most companies 

a basic, quick-to-implement BI solution. 
(Businesses will also fall back on the SQL 
Server-Excel combination because people 
are creatures of habit, and tend to do things 
using familiar tools.) 


In the SQL Server-Excel scenario, users query
data from a SQL Server database then utilize
Excel PivotTables to work with the data.
Often, the amount of data users request from
SQL Server becomes much too large for Excel
to handle. Excel can do powerful analysis
of data, but it is not built to do analysis on
very large data sets as a standalone tool. As
a result, complex PivotTables often provide
very slow performance, or Excel may fail to
produce the table in the first place.


So, users are right when they report that Excel 
has sufficient ability to analyze data. But, 
business decision makers need a solution that 
provides more than the simple analysis they can 
get from a PivotTable. They need access to more 
complex analysis while still taking advantage of 
the familiar tools such as PivotTables. 


SQL Server Reporting Services 

Microsoft SQL Server Reporting Services (SSRS) 
was created to render data into a nice graphic 
display for end users. It was not meant to 
execute queries that perform complex data 
analysis logic before presenting data, nor was 
it intended for complex on-the-fly analysis of 
data before data rendering. 


Using Microsoft SQL Server Reporting Services

Benefits                       Limitations
Clear visual display of data   Data analysis capability is limited
Provides robust data results   Transact-SQL can be limiting
                               Storage and use of reports is decentralized
                               Reliability of data in reports is compromised


The queries that supply data for SSRS reports 
can be used for complex analysis, but its use 
of Transact-SQL means SSRS may not be the 
right tool. 


Here's a scenario in which SQL Server 
Reporting Services isn't the most efficient tool 
for the job: 


A database administrator is asked to create
a simple report that pulls recent client data
from the main database. So, he creates a
stored procedure to use as the data source.


Users love the new report, and over time ask
for more and more modifications to it. He
quickly finds that these changes add quite
a bit of logic to the stored procedure. The
processing time has now slowed to the point
that users are starting to complain about the
amount of time it takes to render the report.


The "simple" report now requires much more 
maintenance and technical skill than ever 
intended. 


SQL Server Reporting Services is often used 
despite the risks outlined above because, 
much like Excel, it is a familiar tool for end 
users. At this point, businesses should consider 
Microsoft SQL Server Analysis Services (SSAS). 
SSAS is designed with the flexibility to meet 
growing needs within your organization for 
more and more complex data analysis and 
reporting. 


SQL Server Analysis Services 

BI software should be designed to facilitate 
analysis of information, not just static data 
reporting. Inspection, exploration, and 
probing of information are all necessary 
components of the discovery process needed 
for effective decision making. 


SQL Server Analysis Services (SSAS) is a 
powerful tool built specifically for analysis of 
large amounts of data. As SQL Server 2005
Books Online (http://msdn2.microsoft.com/en-us/library/ms175609.aspx)
explains:


"Microsoft SQL Server 2005 Analysis Services 
(SSAS) delivers online analytical processing 
(OLAP) and data mining functionality for 
business intelligence applications. Analysis 
Services supports OLAP by letting you design, 
create, and manage multidimensional 
structures that contain data aggregated 
from other data sources, such as relational 
databases." 


Business decision makers think about data
in dimensions (for example, sales over time,
sales by region, or sales by product). SSAS
enables decision makers to explore data and
conduct root-cause and historical analysis
because it can store data from multiple sources
(including SQL Server and others) in a format
that is structurally designed for this kind of
complex data analysis. The formatting extends
to creating structures such as cubes, which
is beneficial because cubes show data after
the analysis is complete. Cubes can then go
one step further by presenting data in a way
that is familiar to business decision makers.
Consequently, this familiarity paired with
robust data enables them to intuitively explore
and understand the information presented.


SSAS also provides you the tools to centrally 
build and maintain business logic such as key 
performance indicators and calculations. This 
is critical to the success of your business in that 
it ensures that you have a single version of the 
truth of the metrics you are monitoring. SSAS 
also provides more advanced analysis through 
predictive analytics. 


Once the SSAS data models are built,
information can be presented in multiple
tools including Microsoft Office Excel,
Microsoft Office SharePoint Server, SQL Server
Reporting Services, Microsoft Office Visio,
and Microsoft Office PerformancePoint Server
2007. You can use this centrally managed
secure data to conduct information analysis
and to build dashboards, scorecards, reports,
and data visualizations.


SSAS models are flexible enough to work with
other tools outside of the Microsoft BI stack to
help people quickly monitor and analyze what 
is happening in their organizations. 


Here are some areas in which SSAS proves 
itself to be an effective business intelligence 
option. 


Using Microsoft SQL Server Analysis Services

Benefits                                     Limitations
Provides insight and dissection of data      New tool for end users to learn
Provides complex analysis of large           Initial time and resource investment
  amounts of data                              for implementation
Leverages OLAP, data mining, and cubes
Provides advanced data display for
  easy understanding
Provides data centralization's
  "one version of the truth"
Streamlines database maintenance
Can reduce workload on network and
  IT resources


As the comparison shows, SSAS as a single 
tool provides the exact level of data analysis 
decision makers need: a visually simple 
display, data interpretation, and less IT 
maintenance and resource drain. 


Summary 

BI software is designed to analyze and 
present data in a way that helps decision 
makers understand information and make 
decisions quickly. At their best, these tools can 
answer "why" and "how" questions behind 
the numbers presented within a report. The 
best BI options also provide reporting and 
analytics together to provide the perfect 
combination of data distribution and tools 
necessary to understand information. 


Many companies look for a solution to their 
business intelligence needs and often come 
up with a mixture of tools that support the 
various types of decision making within their 
environment. As we've seen, Microsoft Office 
Excel or SQL Server Reporting Services are 
common go-to tools because they are familiar 
and can solve many business intelligence 
questions. 


But, by adding SQL Server Analysis Services 
to your existing reporting and analysis 
environment you gain a more efficient 
infrastructure for streamlined analysis, 
presentation, and interpretation of large 
amounts of data. In the end, SQL Server 
Analysis Services may help reduce network 
load problems and the workload on IT 
departments by making BI more robust and
by being more of a self-service tool. 


Randy Dyess, Solid Quality Learning Mentor
and Program Manager: Strategic Initiatives,
has a variety of experiences dealing with SQL
Server 2005 over the past nine years and has
worked with environments with terabytes of
data and environments that had over 1,000
databases with only a few megabytes of data
in each database. Currently, Randy is the
founder and owner of Dyess Consulting Inc.,
a SQL Server 2005 mentoring and training
consulting firm that specializes in training
and mentoring in Transact-SQL and SQL
Server 2005 performance tuning and database
security. Randy is the author of Transact-SQL
Language Reference Guide and numerous
magazine and newsletter articles pertaining
to SQL Server 2005 security and optimization
issues, and has spoken at various international
and national conferences.


SSIS, SSAS, and SSRS upgrades 
improve BI performance


against tables with row counts of over a million rows 
occasionally causes a performance slowdown. SQL 
Server 2008 no longer has this limitation. You can 
perform a lookup against any data source by using 
the standard providers, which include ADO.NET, 
XML, OLE DB, and other data sources. You can 
even perform lookups against other SSIS packages. 

The enhanced TxLookup transformation compo- 
nent of the SSIS package in SQL Server 2008 supports 
internal redundancy on the lookup chain. TxLookup 
also includes several other improvements over SQL 
Server 2005: There's now a pre-charge query in addi- 
tion to the cache-miss query. And for each cache-miss 
query, multiple rows can now be returned. The cache- 
miss query now has a separate connection manager. 
If you use a full or a partial cache query, SQL Server 
2008 loads the hash table and uses the pre-charge 
query. However, if you use a no cache query, SQL 
Server 2008 behaves like SQL Server 2005 and uses 
only the cache-miss query. SSIS in SQL Server 2008 
improves the performance of lookups to support the 
largest tables. 

Data profiler. Good news for ETL gurus—SQL 
Server 2008 SSIS has a data profiler. Now you'll have 
visibility into the source system data before you build 
your ETL solutions, and the ability to code, config- 
ure, and build based upon data patterns. With the 
data profiler you can generate source system metada- 
ta statistics, which you can then view using the stand- 
alone Data Profile Viewer. This viewer also displays 
candidate keys and data distributions. Data profiling 
has long been a requested capability of DTS/SSIS 
and the larger SQL Server product. It's good to see a 
formal solution. 


SSAS Enhancements 

Following the typical progression in a BI solution,
I've discussed the first stage (ETL and SSIS), and
now we're ready to look at creating cubes and mining
models. One of SQL Server 2008's many improvements
to the SSAS architecture is Cube Designer
enhancements.

Cube Designer enhancements. A critical compo- 
nent to SSAS is the practice of good cube design. The 
ultimate success or failure of your BI rollout depends 
on it. I'll briefly survey what's new with Personalized
Extensions, Best Practice Alerts, the Dimensional De- 
signer, the Aggregation Designer, and Named Sets. 

You can use Personalized Extensions to create 
new SSAS objects and functionality, and then provide
these objects and functionality dynamically in
the context of the user session. You don't have to cre- 
ate detailed specifications about where or how to find 
the extended functionality. You can share these new 
objects and functionality immediately with both end 
users and your fellow developers. 

The Cube Designer now has a Best-Practice Alert 
functionality that spans all objects and is generated 
through Analysis Management Objects warnings. 
The warnings alert you when you violate design best 
practices or make logical errors in database design. 
You can detect potential problems with the design in 
a non-intrusive way because these warnings are inte- 
grated into real-time designer checks. 

New and improved features for the Dimensional 
Designer include the Attribute Relationship Design- 
er, a simplified and enhanced Dimension Wizard, and 
the Key Columns dialog box. You can use the new 
Attribute Relationship Designer in the Dimension 
Editor to easily browse and modify attribute relation- 
ships. The Dimension Wizard, which has been modi- 
fied to align output with best practices, auto-detects 
parent-child hierarchies, provides safer default error 
configuration, and supports specification of member 
properties. In the new Key Columns dialog box, the 
enhanced Dimension Structure tab works with the 
Attribute Relationship Designer, making modifying 
attributes and hierarchies easier. 

A new algorithm in the Aggregation Designer 
helps you create initial aggregations. This designer is 
optimized to work with usage-driven aggregations. 
You can view the created aggregations and add to or 
remove them. 

Dynamic named sets are a new capability of SSAS 
2008. A named set in SQL Server 2005 makes it pos- 
sible to define a set of dimension members such as a 
set of the top 10 stores by sales. You define this set 
statically. You can then refer to this named set wher- 
ever you need to see the top 10 stores by sales. In SQL 
Server 2005, set evaluation occurs only at set creation. 
In SQL Server 2008, you can create dynamic named 
sets and define them to be evaluated every time the 
sets are used. 

Performance enhancements. A major portion of 
the SSAS performance enhancements are in areas 
such as subspace computations, Multidimensional 
OLAP (MOLAP)-enabled write-back, and backup 
and storage. 
Cube space is generally sparse, with values existing
only for a small number of dimension intersections.
Although SSAS in SQL Server 2005 evaluates
expressions on complete space, and subspace compu- 
tation is included with SP2, in SQL Server 2008 SSAS 
subspace computations are significantly improved. 
Multidimensional Expressions query performance 
has improved; SSAS deals better with cube space by 
dividing the space to separate calculated members, 
regular members, and empty space to improve evalu- 
ation of cells that need to be included in calculations. 

The new MOLAP-enabled writeback capabili- 
ties in SQL Server 2008 SSAS remove the need to 
store writeback data in ROLAP storage mode. The 
new writeback MOLAP storage mode results in sig- 
nificant performance gains in cubes that leverage the 
writeback capabilities. 

Finally, with SQL Server 2008 SSAS backup
compression, less storage is required to keep back- 
ups online. The backups also run significantly faster 
because less disk I/O is required. There are fewer 
restrictions on the size of the database, and the 
time required for backup and restore operations is 
significantly reduced. 

Data mining. The next important SSAS element 
for any BI solution is data mining. SQL Server 2008 
SSAS enhances data mining models by appending 
a new algorithm to the Microsoft Time Series algorithm.
This improves the accuracy and stability of
predictions in the data mining models. The new al- 
gorithm is based on the Auto-Regressive Integrated 
Moving Average (ARIMA) algorithm, and provides 
better long-term predictions than the Auto Regres- 
sion Trees with Cross Predict (ARTxp) algorithm 
used in SQL Server 2005 SSAS. 

By default, the new implementation of the Micro- 
soft Time Series algorithm uses the ARTxp algorithm 
to train one version of the data mining model and the 
ARIMA algorithm to train another version of the 
data mining model. The algorithm then weighs the re- 
sults of these two data mining models to provide the 
prediction characteristics you want. If you don’t want 
to use the default implementation, you can specify the 
algorithms that the Microsoft Time Series algorithm 
must use. 

In SQL Server 2008 Enterprise Edition, you can 
specify a custom weighting of the algorithms to pro- 
vide the best prediction over a variable time span. 
The improved Microsoft Time Series algorithm ac- 
cepts data during prediction to allow for new busi- 
ness scenarios. For example, you can create a revenue 
prediction model based on averages across products, 
regional aggregates, or some other broad data set. 
You can then apply that model to the time series that 
shows the sales of an individual product. By applying 
the general model, you can take advantage of the sta- 
bility and availability of aggregate data and custom- 
ize prediction to the individual product. You can also 
train models by using multiple series, and then apply 
the models to new data in forecasting scenarios. 


SSRS Enhancements 

Now that we've covered what's new with laying the BI 
groundwork with SSIS and building cubes and min- 
ing models in SSAS, we're ready to review the new 
features and enhancements found in SSRS in SQL
Server 2008. 

Report Server engine. A report server is now 
implemented as a Windows-based service that hosts 
the Report Manager, the Report Server Web service, 
and background processing feature areas. The report 
engine improves supportability and the ability to con- 
trol server behavior with memory management and 
infrastructure consolidation. Consolidating server 
applications into a single service reduces configura- 
tion and maintenance tasks. However, the Report 
Manager and the Report Server Web service applica- 
tions continue to run independently within the single 
service. Both the Report Manager and the Report 
Server Web service can be accessed through URLs 
that provide HTTP access to these applications. 

The report server includes an HTTP listener that 
handles all authentication requests directed to a URL 
and a port you define during server configuration. To
provide the ASP.NET and Report Server Web service, 
the report server uses the new HTTPSYS capabilities 
of the OS instead of IIS. The report server also has 
new management features to set a memory threshold 
for background operations and performance counters
for monitoring service activity. I'll briefly explore
SSRS enhancements for report server deployment 
modes, report authoring, and report designing. 
SSRS continues to expand its delivery options 
with the expansion and enhancement of Rich Text 
Format (RTF), Microsoft Office Word, and Micro- 
soft Office Excel rendering. The improvement of 
the RTF component provides a method for users 
to define mixed formatting in textboxes and import 
marked-up strings of the text into a report generated 
from a database or other data sources. The Microsoft 
Office Word 2007 rendering extension can be used to 
export a report to a Word document without using a 
third-party tool. Finally, the Microsoft Office Excel
rendering extension has been enhanced to support
features such as nested data regions and sub-reports.

Report authoring. While improved report rendering
is all well and good, better report authoring capabilities
bring SQL Server 2008 SSRS to a new level of usability
for developers, power users, and end users looking
for easier report creation. Microsoft has been touting
the Tablix data region type, which features fixed and
dynamic columns and rows, arbitrary nesting on rows
and columns, optional omission of row or column
headers, and the ability to apply multiple parallel row
and column members within the same report.

Report authoring data visualizations now have
better visual fidelity between formats and support for
rich report formats, such as tables and matrices.
Enhanced features include:

* Expression placeholder text. Expressions use
placeholder display text in text boxes on the report
design surface or in data regions.

* Expression-based parameter prompts. The
Prompt property for a report parameter can be an
expression.

* Processing-time variables. Variables that are global
throughout the report or local to a particular group
can be declared and referred to in expressions.

The Report Designer has also been upgraded with
features such as new query constructs to return all
instances in a recursive hierarchy. New query constructs
support functions such as Rank and Top N. The tool
has a new UI for obtaining grand totals, and it supports
cross-joins, which are required for common
analytic queries. SSRS business users have wanted
a more user-friendly version of the BI Development
Studio report designer tool. The SQL Server team
responded by creating a separate standalone report
designer outside of this tool.

Other enhanced features to help you design reports
include:

* Entity hierarchies that provide a flattened analytic-style
metadata browser that presents all entities as
a flattened list.

* Live data in design view that allows display of
live data by using simple iteration of design-time
elements.

* Instances in metadata browser that extend the
metadata browser to include instance data.

* Filtering on the design surface that adds UI
elements for defining basic filter conditions directly
on the design surface.

* An interface that mirrors the Office 2007
products.

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* Conditional formatting in response to 
customer recommendations. 

* Standalone deployment that helps address is- 
sues that occur during Click-Once deployment. 

* Built-in forms authentication that enables users to 
easily switch between Windows and Forms. 

* Report Server application embedding that enables 
the URLs in reports and subscriptions to point 
back to front-end applications. 


SQL Server 2008 BI: Is It for
Your Organization?
As I’ve mentioned earlier, I see scalability and perfor- 
mance as the most significant areas of improvement 
in SQL Server 2008. Reports run faster, various que- 
ries can execute faster, and writebacks in SSAS are 
faster. A handful of brand-new capabilities, such as 
the Data Profiler in SSIS, may also make you think 
seriously about migration. Overall, SQL Server 2008 
is an evolutionary upgrade that provides a better-performing
BI platform.
InstantDoc ID 98467 
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Creating Dimensions 


Make the best cubes to help your users analyze data fast


Creating the best possible cubes in SQL Server
Analysis Services (SSAS) 2005 and 2008 requires 
good dimension design, including creating the 
proper attributes and making meaningful hierarchies 
with those attributes. Well-designed dimensions ensure 
that the data in cubes calculates correctly so users can 
analyze that data and turn it into useful information. 
Let’s explore how to create dimensions in SSAS. 
Then, in the Web-exclusive follow-up article “Creating 
Dimensions in SSAS Part 2,” InstantDoc ID 98699, 
we'll look at more aspects of dimensions, including 
creating a cube and analyzing attribute relationships. 


Create a Data Source 
The first step in working with an SSAS project is to 
create a data source. I use the AdventureWorksDW 
database, one of the sample databases available for 
SQL Server 2005 and SQL Server 2008. Next, you 
need to create a data source view (DSV), which
Figure 1 shows. A DSV is a logical representation
of a schema and includes tables or views
from one or more databases, queries that act like
views but only exist in the DSV, and more. For
this example, I'll add the following tables to the
DSV: DimProduct, DimProductCategory,
DimProductSubCategory, DimTime, and
FactInternetSales. This makes a very simple snowflake
schema that will have two dimensions: Time and
Product.

At this point, you have a choice: You can run
either the cube wizard or the dimension wizard.
The cube wizard creates one or more of the dimensions
if they don’t already exist, then proceeds to
create the cube. The dimension wizard walks you
through the process of creating dimensions one at
a time, and of course doesn't create cubes. To better
explain the process and show what's being created,
I use the dimension wizard to create the Product
and Time dimensions.


Create the Product Dimension
Right-click the Dimensions folder in the Solution
Explorer and choose New Dimension to launch the
dimension wizard. The first page enables users to build
a dimension with or without a data source. Normally
you build a dimension with a data source, and when
you select this option, you'll see a check box for
automatically building attributes and hierarchies (although
this can be changed to create just attributes). Accept
the defaults and click the Next button to advance the
wizard to the page for selecting a DSV. This project has
only one DSV, so you simply click Next.

The wizard then asks you to select the dimension
type: Standard, Time, or Server Time. The Time
dimension option adds an extra step, which I'll cover
in a moment. The Server Time dimension creates a
dimension table based on a start date, end date, and a
selection of levels. Standard dimensions, to the wizard,


SQL Server Magazine * www.sqlmag.com 




FEATURE 


Craig Utley
(craig@solidq.com) is a mentor with Solid Quality Mentors and a former
program manager with the SQL Customer Advisory Team at Microsoft. He is
the author of Business Intelligence with Microsoft Office PerformancePoint
Server 2007 (Osborne/McGraw-Hill).


Figure 1: Data source view


May 2008 21 


Figure 2: Time dimension hierarchy (Dimension Wizard, Review New Hierarchies page, showing the hierarchy Calendar Year - Calendar Quarter - Month Number Of Year - Full Date Alternate Key)




are anything that isn’t a time dimension. Most cubes 
will have a Time dimension and several standard 
dimensions, as is the case here. Select the Standard 
dimension option and click Next. 

Choose the main dimension table. After you select 
the table, the columns are listed and the key column, if 
it can be determined, is checked. Here you can change 
the key column as necessary. If you change the column
name, the actual value continues to be the key, but
the user sees a more familiar, descriptive value.

After you click Next, you'll see a screen verifying 
related tables. This screen appears only when the 
dimension is made up of multiple tables, as is the case 
with a snowflake schema. 

Click Next to advance to the Select Dimension 
Attributes page. This page lists all the columns in the 
table(s) making up the dimension. In SSAS, each column 
becomes an attribute and can be used for analysis inde- 
pendently from any other attribute. For products, this 
means that users can analyze by such attributes as size, 
color, and weight, without the need to create separate 
dimensions. The ability to analyze by any attribute is 
extremely powerful but can be confusing for end users 
faced with dozens of attributes. The cube developer can 
remove attributes at this stage and also hide attributes in 
the dimension after the dimension has been created. 

The next screen in the dimension wizard asks for 
a dimension type; most dimensions will work fine as 
regular dimensions, so select Regular and move to the 
next screen. This screen asks if the dimension contains 
a parent-child attribute, which it doesn't, so it's safe to 
continue to the next screen. 

The wizard now attempts to detect hierarchies, but 
fails to find any in this case. That's too bad because there 
is a clear hierarchy here (ProductCategory to Product- 
SubCategory to Product), but you have to create it 
manually after the wizard is done. Click the Next button 
a couple of times until you see the Completing the Wizard 
screen. This screen shows the completed dimension with 
the attributes and, if any are found, the hierarchies. Here 
you can rename the dimension if desired; many users 
prefer to drop the word “Dim” from the front of the 
dimension, naming it simply “Product.” Click Finish to 
create the dimension. 
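
To make the ProductCategory to ProductSubCategory to Product hierarchy concrete, here is a minimal Python sketch of how the snowflaked tables chain together through their keys. The rows, key values, and function names are hypothetical and only stand in for the DimProductCategory, DimProductSubCategory, and DimProduct tables; SSAS builds the real structure for you once you define the hierarchy.

```python
# Hypothetical, trimmed-down stand-ins for the three snowflaked product tables.
# Each dict maps a surrogate key to the row's name (and parent key, if any).
categories = {1: "Bikes", 2: "Accessories"}
subcategories = {10: ("Mountain Bikes", 1), 11: ("Road Bikes", 1), 20: ("Helmets", 2)}
products = {
    100: ("Mountain-100", 10),
    101: ("Road-150", 11),
    200: ("Sport-100 Helmet", 20),
}


def hierarchy_path(product_key):
    """Walk Product -> ProductSubCategory -> ProductCategory via foreign keys."""
    product_name, subcategory_key = products[product_key]
    subcategory_name, category_key = subcategories[subcategory_key]
    return (categories[category_key], subcategory_name, product_name)


def build_hierarchy():
    """Group every product under its category and subcategory."""
    tree = {}
    for key in products:
        category, subcategory, product = hierarchy_path(key)
        tree.setdefault(category, {}).setdefault(subcategory, []).append(product)
    return tree


if __name__ == "__main__":
    for category, subs in build_hierarchy().items():
        print(category, subs)
```

Every product resolves to exactly one subcategory and one category, which is the property a user-defined hierarchy relies on when users drill down from category to product.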


Create the Time Dimension 

Now let’s move on to the Time dimension and look at 
the differences between a time and a non-time, or stan- 
dard, dimension. The initial process is the same: Right- 
click the Dimensions folder in the Solution Explorer 
and choose New Dimension, accept the defaults to 
have the dimension built with a data source, and select 
the AdventureWorksDW DSV. 

The next screen is the Select the Dimension Type 
page, and this time you click Time dimension and select 
dbo_DimTime in the drop-down list. The wizard shows 
an extra screen called Define Time Periods, where you 
assign various columns in the Time dimension table to 
time properties. I assigned only four of the columns to 
keep the example simple. 

Now the wizard displays the hierarchies it’s identi- 
fied. The Time dimension almost always has at least 
one hierarchy if you’ve assigned columns to the time 
properties. Figure 2 shows the hierarchy resulting 
from the four columns I assigned. It contains the 
unfortunate name “Calendar Year - Calendar Quarter
- Month Number Of Year - Full Date Alternate Key.”
I'll change it to just “Calendar” later. You can also
change the names of the levels within the hierarchy, 
remove the entire hierarchy, or remove certain levels 
within the hierarchy, as needed. 
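
As a rough illustration of what the four levels in that hierarchy represent, the following Python sketch derives a year, quarter, month number, and date for a single calendar day. The function name and the tuple layout are my own assumptions for illustration; in the real project the values come from columns in the DimTime table rather than being computed.

```python
from datetime import date


def calendar_levels(day):
    """Return (calendar year, calendar quarter, month number of year, full date)."""
    quarter = (day.month - 1) // 3 + 1  # months 1-3 -> 1, 4-6 -> 2, and so on
    return (day.year, quarter, day.month, day.isoformat())


def roll_up(days):
    """Nest dates under year -> quarter -> month, the way the hierarchy drills down."""
    tree = {}
    for day in days:
        year, quarter, month, full_date = calendar_levels(day)
        tree.setdefault(year, {}).setdefault(quarter, {}).setdefault(month, []).append(full_date)
    return tree


if __name__ == "__main__":
    print(calendar_levels(date(2008, 5, 15)))
    print(roll_up([date(2008, 1, 31), date(2008, 5, 15), date(2008, 5, 16)]))
```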

Now our project has two dimensions. We haven't 
yet created a cube, but you can still process 
one or both of the dimensions, which involves 
reading the data from the dimension tables and 
building the dimension structure. After processing, 
verify the dimension structure by browsing the 
data in the dimension using the Browser tab at the 
top. In my example, I see a warning symbol next 
to the dimension name—this isn't caused by the 
overly long name but indicates a fundamental problem 
with dimensions, which we'll examine in Part 2. 


Good Dimensions, 

Good Data Warehouses 

When building a data warehouse, some developers 
new to SSAS downplay the need for proper dimen- 
sional modeling in the relational database, figuring 
that the tools in SSAS can overcome any deficiencies 
in the underlying relational schema. Although SSAS 
can create dimensions from a normalized schema, a 
relational data warehouse significantly simplifies the 
creation of dimensions and cubes in SSAS and also 
ensures cleaner data. For more information about 
dimensions and data warehousing, see this article's 
online Learning Path at InstantDoc ID 98510. Now
you're ready to move on to Part 2 on the Web
(InstantDoc ID 98699), where we'll look at the reason
for the warning symbol, create a cube, and discuss how
attribute relationships help users process data.


InstantDoc ID 98510 
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Create special dimensions to hold miscellaneous 
attributes found in the source database 


When you're designing a data warehouse,
you'll occasionally run into attributes from
the source database(s) that don’t fit into
neat, tight star schemas. We've all seen OLTP tables 
that are full of flag fields and yes/no attributes, many 
of which are used for operational support and have no 
documentation except for the column names and the 
memory banks of the person who created them. So 
how should you handle open-ended text and comment 
attributes, many of which are badly designed in the 
OLTP schemas? Not only do those types of attributes 
not integrate easily into conventional dimensions such 
as Customer, Vendor, Time, Location, and Product, 
but you also don’t want to carry bad design into the 
data warehouse. However, some of the miscellaneous 
attributes will contain data that has significant business 
value, so you have to do something with them. 
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There are three conventional ways to deal with 
these attributes: discard all of the miscellaneous attri- 
butes, eliminating them from the dimensional design; 
incorporate the miscellaneous attributes into the fact 
table; or make each miscellaneous attribute a separate 
dimension. However, all of these options are less than 
ideal. Discarding the data can be dangerous because 
the miscellaneous values, flags, and yes/no fields might 
contain valuable business data. Including the miscel- 
laneous attributes in the fact table could cause the fact 
table to swell to alarming proportions, especially if you 
have more than just a few miscellaneous attributes. 
The increased size of the fact table could cause serious 
performance problems because of the reduced number 
of records per physical I/O. Even if you tried to index 
these fields to minimize the performance problems, you 
still wouldn't gain anything because so many of the 
miscellaneous fields contain flag values such as 0 and 
1; Y and N; or open, pending, and closed. (For more 
information about indexing, see "Indexing Dos and 
Don'ts,” January 2003, InstantDoc ID 27334.) And 
if you make each miscellaneous attribute a separate 
dimension, it will most likely result in a complicated 
dimensional design. For example, a star schema for 
a shop floor or manufacturing activity would include 
the standard dimensions of people (e.g., customers, 
vendors, employees), time, location, and inventory or 
product, which is a tight, straightforward design for 
a four-dimensional cube. But once you start incorpo- 
rating 10 or 20 additional dimensions, each of which 
is a yes/no attribute, a status field, or an open-ended 
comment, you're looking at a much more complicated 
star schema and associated cube. 
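
The "fewer records per physical I/O" argument against widening the fact table can be put into rough numbers. The sketch below assumes an 8 KB SQL Server data page with 8,096 bytes left after the page header, ignores per-row overhead, and uses hypothetical row widths (a 40-byte fact row, 20 one-byte flag columns, a 4-byte junk key); treat the results as an order-of-magnitude illustration only.

```python
# Usable bytes on an 8 KB data page after the 96-byte page header.
# Per-row overhead and the slot array are ignored to keep the arithmetic simple.
PAGE_DATA_BYTES = 8096


def rows_per_page(row_bytes):
    """Approximate number of fixed-width rows that fit on one data page."""
    return PAGE_DATA_BYTES // row_bytes


BASE_ROW_BYTES = 40   # hypothetical fact row: keys plus measures
FLAG_COLUMNS = 20     # hypothetical count of one-byte miscellaneous attributes
JUNK_KEY_BYTES = 4    # a single integer foreign key to a junk dimension

if __name__ == "__main__":
    print("base fact row:      ", rows_per_page(BASE_ROW_BYTES))
    print("with 20 flag bytes: ", rows_per_page(BASE_ROW_BYTES + FLAG_COLUMNS))
    print("with one junk key:  ", rows_per_page(BASE_ROW_BYTES + JUNK_KEY_BYTES))
```

Under these assumptions, carrying the flags in the fact table costs roughly a third of the rows per page, while a single junk key costs less than a tenth.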


Determining the Value of 
Miscellaneous Attributes 
You can determine which miscellaneous attributes 
have business value through a process called discovery. 
You have to ask the right people the right questions to 
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determine if these miscellaneous flag and text attributes 
are truly useful, and you can’t be satisfied with only 
one person’s opinion. For example, someone in the 
production department might consider the open-ended 
comments to be truly important, while someone in the 
sales department might not think so. Your discovery 
process has to be all-encompassing and thorough. 
You must understand how the data in transactional 
databases is used and for whom it has value before you 
can determine if the miscellaneous attributes should 
be retained or discarded. Here are my suggestions for 
how to handle flag or comment attributes that have 
business value. 


Handling Flag Fields and 

Yes/No Attributes 

I'm sure you've seen many examples of flag fields and
yes/no data that have no documentation regarding how 
they're used. This scenario is especially common in 
legacy systems and databases that were created without 
solid, underlying design principles. Column names such 
as Completed, Packed, Shipped, Received, Delivered, 
and Returned (each with yes/no data values) are very 
common, and they do have business value. Instead of 
discarding flag fields and yes/no attributes, I suggest 
placing them all into a junk dimension that's organized 
as shown in Figure 1. 


Figure 2: Junk dimensions (JunkKEY, TextJunkKEY) associated with the fact table; the flag columns include Shipped, Returned, and Refunded

The junk dimension shown in Figure 1 represents 
an order-fulfillment system; the column headers 
show some of the possible statuses an item that has 
been ordered can have. Row 1 indicates that the item 
ordered has been picked out of the warehouse, packed 
for shipment, shipped, delivered, received, returned 
for a refund, and restocked in the warehouse. Row 
9 shows an item on order that's waiting to begin 
the order-fulfillment process. The rows in between
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indicate items that are in various stages of the order- 
fulfillment process. This example is very simple 
because the process is so linear and sequential, and 
NULL conditions aren't allowed in this transactional 
database. As in any dimensional design, each of the 
rows in the fact table will be associated with a row in 
this junk dimension. Even if your set of miscellaneous 
attributes isn't as sequential or you have to create a 
dimension that contains all possible combinations 
of yes and no, there will still be only 256 records in 
the entire dimension. As with all dimensional design, 
don't forget to add an identity column to the junk 
dimension as a primary key and to include the junk 
key column in the fact table as a foreign key. 
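
The figure of 256 records is simply 2 to the 8th power, the number of yes/no combinations for eight flags. A minimal Python sketch of building such a dimension follows; the eight flag names are my guess at the order-fulfillment statuses, and JunkKey, build_junk_dimension, and build_lookup are hypothetical names used only for illustration.

```python
from itertools import product

# Assumed flag names for an order-fulfillment process (eight yes/no attributes).
FLAGS = ["Picked", "Packed", "Shipped", "Delivered",
         "Received", "Returned", "Refunded", "Restocked"]


def build_junk_dimension(flags):
    """Create one row per Y/N combination, keyed by an identity-style surrogate key."""
    rows = []
    for junk_key, combo in enumerate(product("YN", repeat=len(flags)), start=1):
        row = {"JunkKey": junk_key}
        row.update(zip(flags, combo))
        rows.append(row)
    return rows


def build_lookup(rows, flags):
    """Map a tuple of flag values to its JunkKey, for use while loading the fact table."""
    return {tuple(row[f] for f in flags): row["JunkKey"] for row in rows}


if __name__ == "__main__":
    dimension = build_junk_dimension(FLAGS)
    lookup = build_lookup(dimension, FLAGS)
    print(len(dimension))            # number of rows in the junk dimension
    print(lookup[("Y",) * 8])        # key for an order with every flag set to yes
```

During the fact-table load, each source row's flag values are looked up once and replaced by the single JunkKey foreign key.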


Handling Comment and 
Open-ended Text Attributes 

You can handle comment data and free-form, open- 
ended text fields by creating a special text-based junk 
dimension. If the verbiage in these fields is potentially 
valuable, create the text-based junk dimension with 
two columns, the key column and the text column, as 
shown in Figure 2. Include the text junk key column 
in the fact table as a foreign key, and don’t forget to 
add a “no comment” record to the text-based junk 
dimension for those facts that have no associated text. 
Most likely, these text fields in the source systems 
will be used only sporadically, so the text-based junk 
dimension will be much smaller than its associated 
fact table. Figure 2 also shows how the yes/no attri- 
bute junk dimension described earlier will relate to 
the fact table. 
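
Here is a small Python sketch of the text-based junk dimension described above, including the "no comment" record. The column name TextJunkKey, the reserved key 0, and the sample comments are assumptions for illustration; a real load would do the same lookup in your ETL tool.

```python
NO_COMMENT_KEY = 0  # hypothetical reserved key for facts with no associated text


def build_text_junk_dimension(source_rows):
    """Split (order_id, comment) pairs into a text dimension and fact rows.

    Returns (dimension, facts): dimension maps TextJunkKey -> text, and each
    fact row carries only the foreign key, never the text itself.
    """
    dimension = {NO_COMMENT_KEY: "No comment"}
    key_by_text = {}
    facts = []
    for order_id, comment in source_rows:
        text = (comment or "").strip()
        if not text:
            text_key = NO_COMMENT_KEY
        else:
            if text not in key_by_text:
                new_key = len(dimension)  # next unused key: 1, 2, 3, ...
                key_by_text[text] = new_key
                dimension[new_key] = text
            text_key = key_by_text[text]
        facts.append({"OrderID": order_id, "TextJunkKey": text_key})
    return dimension, facts


if __name__ == "__main__":
    rows = [(1, None), (2, "Leave at back door"), (3, ""), (4, "Gift wrap")]
    dim, fact_rows = build_text_junk_dimension(rows)
    print(dim)
    print(fact_rows)
```

Because most source rows carry no comment, they all share the one "no comment" record, which keeps the dimension far smaller than the fact table.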


Keeping Your Data Warehouse 
Design Simple 
You want to keep the data warehouse design as simple 
and straightforward as possible, so that users will be 
able to access data easily. Miscellaneous attributes 
that contain business value are a challenge to include 
in your data warehouse design because they don’t fit 
neatly into conventional dimensions, and if improperly 
handled, can cause the data warehouse to swell in size 
and perform suboptimally. By placing miscellaneous 
attributes into junk dimensions, you can circumvent 
both of these problems.
InstantDoc ID 98356
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Tools and techniques to help you discover 
SQL Server security weaknesses 


One of the best features of SQL Server 2005 is
that it's secure by default: A pristine installation
(i.e., one with the defaults largely intact)
is about as secure as it will ever be. But then DBAs and 
developers mess up this pristine security by creating 
and installing databases, giving users and groups access 
to data, and building Web applications that indirectly 
give the untrusted masses access to sensitive enterprise 
data. Suddenly, SQL Server is a security nightmare. 
You can do all those things securely, of course, if 
you're careful and monitor the entire attack surface. 
But diligent monitoring can be a full-time occupation 
for several people even in small enterprises, and it's all 
too easy to miss obscure vulnerabilities. 
That's where tools come in. Over the last few years,
dozens of tools have appeared to simplify the job
of anyone concerned with the security of a SQL Server
system and its databases. The tools run the gamut from
free to expensive, single to general purpose, simple
to complex. Although most are intended for the good
guys, you can be sure that the bad guys are using the
same tools to probe and poke at your systems. Over
the years, I've found and used a number of security
tools; Table 1 lists some of my favorites, including
several tools I've found especially useful for boosting
SQL Server security. Together, the
tools perform a gamut of security-testing functions:
They locate your SQL Server instances, assess your
network security for conformance with best practices,
crack your passwords, perform a vulnerability analysis,
and keep your system updated. I've found the tools
especially useful for finding the holes in a SQL Server
instance and identifying vulnerabilities.

TABLE 1: Security Tool Resources

Tool: Absinthe
Purpose: Performs SQL injection testing
URL: www.0x90.org/releases/absinthe

Tool: Cain & Abel
Purpose: Provides industrial-strength password recovery (cracking) for just about all Microsoft OS and product passwords, including SQL Server
URL: www.oxid.it/cain.html

Tool: Metasploit Project
Purpose: Powerful, general-purpose tool that does penetration testing and exploit research
URL: www.metasploit.com

Tool: Microsoft Baseline Security Analyzer
Purpose: General security analyzer that finds vulnerabilities and unpatched products on machines. Only support for SQL Server 2005 is to find missing patches.
URL: www.microsoft.com/technet/security/tools/mbsahome.mspx

Tool: SQL Server 2005 Best Practices Analyzer
Purpose: Analyzes selected components of one or more instances of SQL Server 2005 for known security vulnerabilities
URL: www.microsoft.com/downloads/details.aspx?familyid=da0531e4-e94c-4991-82fa-f0e3fbd05e63&displaylang=en

Tool: NGSSQLCrack
Purpose: SQL Server password cracker that uses a variety of techniques to search for easily cracked passwords
URL: www.ngssoftware.com/products/database-security/ngs-sqlcrack.php#features

Tool: NGSSQuirreL for SQL Server
Purpose: Security-scanning tool for finding many kinds of vulnerabilities in SQL Server
URL: www.ngssoftware.com/products/database-security/ngs-squirrel-sgl.ph

Tool: SQLPing
Purpose: Finds all instances of SQL Server and does rudimentary …
URL: www.sqlsecurity.com/tools/freetools/tabid/65/default.aspx


Don Kiely (donkiely@computer.org), MVP, MSCD, is a
senior technology consultant specializing in
developing secure desktop and Web applications
that integrate databases and related
technologies. When he isn't writing software,
he’s writing and speaking about it.


Before We Start...

...I'll assume that you're familiar with SQL Server's
built-in security tools. You should be well acquainted
with the capabilities of the Surface Area Configuration
tool and SQL Server Configuration Manager as well as
SQL Server Management Studio (SSMS). SQL Server
2005 has some marvelous security features that you
should put to use. You shouldn't use external tools as
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a substitute for careful security planning. Start secure 
and use the tools to strengthen and fine-tune security. 

I also avoid discussing SQL injection and cross-site 
scripting (XSS) testing tools, such as Absinthe and 
others. SQL injection and XSS are UI vulnerabilities, 
rather than direct attacks on the database server. Think 
about it: From the perspective of the server, how can 
SQL Server distinguish friendly versus malicious code? 
By the time the code arrives at the server, all context 
that could help to identify it has vanished from the 
database request, other than what connection it's 
coming in on. SQL injection is a serious threat, but it 
isn't a server vulnerability as such, so I won't cover it 
here. Vulnerabilities in the server can enable these kinds 
of attacks, however. 


Find SQL Server Instances 
You can't secure servers you don't know about (or
have forgotten to add to your audit list), and that's
where the free SQLPing 3.0 from SQLSecurity.com
comes in. Ever since the SQLsnake worm appeared,
we've known that the SQL Server Browser service is a
security vulnerability, exposing information about 
available database servers to attackers. As a result, 
finding SQL Server instances running on your network 
can be problematic. There are several known ways to 
scan and find running instances, and SQLPing takes 
advantage of them all. 
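
To give a feel for one of those discovery techniques, here is a hedged Python sketch that asks a single host's SQL Server Browser service which instances it knows about. It assumes the service answers a one-byte 0x03 request on UDP port 1434 with a 0x05-prefixed, length-prefixed, semicolon-delimited string; that message layout is my assumption about the protocol, not something documented in this article, and the sketch is not a description of how SQLPing works internally. The function names are hypothetical. Probe only machines you're responsible for.

```python
import socket
import struct
from typing import Dict, List

BROWSER_PORT = 1434          # UDP port used by the SQL Server Browser service
UNICAST_REQUEST = b"\x03"    # assumed "list all instances on this host" request
RESPONSE_MARKER = 0x05       # assumed first byte of a Browser response


def parse_browser_response(payload: bytes) -> List[Dict[str, str]]:
    """Turn a Browser response into one dict of key/value pairs per instance."""
    if len(payload) < 3 or payload[0] != RESPONSE_MARKER:
        raise ValueError("not a SQL Server Browser response")
    (length,) = struct.unpack_from("<H", payload, 1)  # little-endian body length
    text = payload[3:3 + length].decode("ascii", errors="replace")
    instances = []
    for chunk in text.split(";;"):        # instances are separated by ';;'
        fields = chunk.strip(";").split(";")
        if len(fields) < 2:
            continue                      # skip the empty trailing chunk
        instances.append(dict(zip(fields[0::2], fields[1::2])))
    return instances


def probe(host: str, timeout: float = 2.0) -> List[Dict[str, str]]:
    """Send one request to a host and return whatever instances it reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(UNICAST_REQUEST, (host, BROWSER_PORT))
        try:
            data, _ = sock.recvfrom(65535)
        except socket.timeout:
            return []                     # no Browser service, or it's blocked
    return parse_browser_response(data)
```

A call such as probe("RiverChaser") would return a list of dictionaries with keys like InstanceName and tcp if the Browser service is reachable; an empty list means only that this one technique found nothing.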

SQLPing is easy to use, which is good since there's 
no documentation for the tool beyond the descrip- 
tion on the SQLSecurity.com Web site. You can 
perform active scans in which the tool actively pings 
the network—noisily announcing its presence—or a 
stealth scan where it simply searches Active Directory 
(AD) for any SQL Server registrations and checks the 
SQL Server Browser Service to see whether any servers 
have broadcast their existence. Active scans are more 
accurate and more clearly reveal activity on your SQL 
Server network, so in general you should use active 


scans. As a bonus, SQLPing will also perform dic- 
tionary and brute-force password checking. However, 
the brute-force password checks are less robust than 
those of other tools. 

Figure 1 shows the results of scanning a small net- 
work. In this case, the scan successfully found all the 
SQL Server instances on two machines, RiverChaser 
and Puppy, and extracted information about them. 
In my experience using SQLPing, I’ve found that it 
almost always finds every instance when other tools, 
even SSMS, find nothing. 


Security Best Practices 

Once you've found all the SQL Server instances on
your network, it's time to get to work evaluating how
secure they are. So the very next thing you should do
is run Microsoft's SQL Server 2005 Best Practices
Analyzer. This free, easy-to-use tool works surprisingly 
well for catching all the low-hanging fruit of security 
vulnerabilities. 

The Best Practices Analyzer lets you select which 
SQL Server instances to scan on the local machine. You 
can scan other machines on the network, but because 
the tool accesses the registries and other resources, you 
get a better scan when you run it locally. If you do want 
to scan across the network, you'll probably need to be 
a domain or local administrator with permissions on 
the remote machine's registry. The tool has various 
options for selecting which components to scan for in 
each instance and can import or export component 
lists. You can also select which databases to include in 
the scan; the default is to scan all databases, including 
the system databases. The analyzer defines a large set 
of rules that define best practices, and you can control 
which rules it uses to scan a particular server. 

Once you've set the analyzer's options, click Scan 
Selected Components to start the scan. The scan can 
take anywhere from a couple of minutes to a very long
time, depending on the number of server instances 
and components you select. The scan checks more 
than 100 server and database issues related to known 
vulnerabilities, then produces a report similar to that 
in Figure 2, page 31. Each issue discovered includes 
a brief description, often a link to the Help file, and 
an option to stop checking the rule for any or all SQL
Server instances for future scans, when appropriate. 
You can enable any rules you've disabled by selecting 
Other Reports on the View Best Practices Report page 
to view the Hidden Item reports. That's not exactly an 
intuitive location, and I have to hunt for it every time 
I need it. Unfortunately, the Help documentation only 
tells you to open the Disable Issues list but doesn't say 
how to find this nonexistent list. 

The scan rules are briefly documented in the Help 
file under the misleading section name Microsoft SQL 
Server Best Practices Analyzer - Articles. The section
describes each rule and the best practices associated
with each. So far, I've found that the best practices
are indeed good advice, although sometimes you 
might have valid reasons to violate a practice. 

The analyzer’s UI is reasonably simple, but its 
modified wizard interface is a bit clunky if you
want to go back a step and is sometimes confusing 
about where to find various options. But after 
you've used the Analyzer wizard a few times, 
you'll get the hang of how it works. Probably the 
biggest downside to the Best Practices Analyzer 
is that you can’t customize the predefined rules it 
scans for. You can tell it to ignore selected rules, 
but that’s the extent of the customizations you can 
make. Nevertheless, make it a habit to run the Best 
Practices Analyzer regularly to keep tabs on the 
most common vulnerabilities. 

Another Microsoft tool you might come 
across is the Microsoft Baseline Security Analyzer 
(MBSA), which claims to support SQL Server. 
Although this tool is fine for general security analysis 
for a machine, the latest version, 2.0.1, supports SQL
Server 2000 only and is of limited use for SQL Server 
2005 machines. MBSA’s only support for SQL Server 
2005 is to make sure that you have the latest patches, 
which certainly is a useful feature by itself. 


Cracking Passwords 

Strong passwords are the foundation of a secure 
server. It’s a rare SQL Server instance that can get 
away with using Integrated Windows authentication 
alone, so you probably have lots of SQL Server logins 
with weak passwords. Many SQL Server password- 
cracking tools are available, but NGSSQLCrack from 
Next Generation Security Software is probably the 
easiest to use. This is a commercial tool and costs 
around $500 depending on how you license it and 
the current exchange rate (NGSSoftware is based in 
England). NGSSQLCrack will connect to the SQL 
Server instance of your choice and grab the SQL login 
password hashes, or you can either enter the password 
hashes manually or copy them into the tool. NGSSQL- 
Crack relies on both dictionary and brute-force attacks 
and provides some simple options for customizing 
the session. For example, you can specify your own 
dictionary file and specify the character set—including 
case-insensitive options—for the brute-force attacks. 
NGSSoftware’s Web site says NGSSQLCrack works 
with SQL Server 7.0 and 2000, but it worked just fine 
on SQL Server 2005 passwords for me. 
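
The effect of the character set and password length on a brute-force attack is easy to quantify. The following Python sketch counts the candidate passwords an exhaustive search would have to try; the guess rate in the example is a made-up figure for illustration, not a measurement of NGSSQLCrack or any other tool.

```python
import string


def keyspace(charset_size, min_len, max_len):
    """Count every candidate password between min_len and max_len characters."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))


def days_to_exhaust(candidates, guesses_per_second):
    """Worst-case days needed to try every candidate at a given guess rate."""
    return candidates / guesses_per_second / 86400


# Letters and digits, with and without case sensitivity.
CASE_INSENSITIVE = len(string.ascii_uppercase + string.digits)
CASE_SENSITIVE = len(string.ascii_letters + string.digits)

if __name__ == "__main__":
    rate = 1_000_000  # hypothetical guesses per second
    for label, size in (("case-insensitive", CASE_INSENSITIVE),
                        ("case-sensitive", CASE_SENSITIVE)):
        total = keyspace(size, 1, 8)
        print(label, total, round(days_to_exhaust(total, rate), 1), "days")
```

Each extra character multiplies the search space by the size of the character set, which is why long, mixed-case passwords hold up so much better than short ones.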

It can take a long time to perform a complete 
crack, depending on the size of your dictionary file, 
the character set you select for the brute-force analysis, 
and the password size range you select. The tool reports 
any passwords it discovers immediately, as you can
see in Figure 3, so you can take whatever action you
want without waiting for the session to finish. Figure
3 shows the session after only a few minutes, already
with a successful dictionary crack. After NGSSQLCrack
had run for hours on my system, I was relieved
that it still hadn’t cracked the strong passwords for sa
and carol.

Figure 2: Results of a SQL Server 2005 Best Practices Analyzer scan of a local instance

NGSSoftware claims that NGSSQLCrack isn’t a 
hacker’s tool, since you need administrative access to a 
machine to get the password hashes for cracking. But 
it’s all too easy to gain such access through applica- 
tions, such as by using SQL injection. Once an attacker 
has cracked some of your passwords, all kinds of nasty
attacks become possible. At that point, you might as
well just post your data on your Web site for all the
world to see.

Figure 3: NGSSQLCrack running

If you want to get into industrial-strength password 
cracking, the tool of choice is the free, cross-platform 
Cain & Abel. This tool gives you many more options than 
NGSSQLCrack for gathering, sniffing, and cracking all 
kinds of passwords—from Windows and other OSs 
as well as SQL Server—along with much more robust 
cracking options. Cain & Abel is a true hacker's tool,
and you'll probably need to spend some time figuring out
the tool and learning how to use it effectively. It's almost
scary how well Cain & Abel can crack passwords, so
much so that you'll never again create a simple or short
password for any use whatsoever.

The choice between NGSSQLCrack and Cain & 
Abel is a matter of cost and ease of use. NGSSQL- 
Crack makes the whole cracking process easy but is 
expensive. Cain & Abel is free and has more power and 
flexibility but is also more complex and harder to learn. 
Overall, the results seem to be similar. 
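Because both tools succeed mainly against weak passwords, it can help to know which SQL logins are exempt from password policy in the first place. The query below is a minimal sketch, not part of either tool, and assumes SQL Server 2005 or later, where the sys.sql_logins catalog view is available. It reports policy settings only; it says nothing about how strong an individual password is.

```sql
-- Sketch (assumes SQL Server 2005 or later): list SQL logins that
-- don't enforce the Windows password policy or password expiration.
SELECT  name,
        is_policy_checked,      -- 0 = password policy not enforced
        is_expiration_checked,  -- 0 = password expiration not enforced
        is_disabled
FROM    sys.sql_logins
WHERE   is_policy_checked = 0
   OR   is_expiration_checked = 0
ORDER BY name;
```

Logins that show up here are reasonable first candidates for a cracking test on a server you administer.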


Industrial-Strength 
Vulnerability Analysis 

Many SQL Server hacking tools are niche products,
focusing on one aspect of security such as password
strength or port visibility. But there are literally hundreds
of potential vulnerabilities in a product as complex as
SQL Server, and it would take the most diligent
administrator years to find all the problems. That's where a
comprehensive, industrial-strength vulnerability scanner
is a lifesaver. Many such commercial vulnerability
scanners are available, most of which are general network
analyzers that happen to include scans of SQL Server
instances. These include commercial, open-source, and
freeware products. The SQL Server-specific features
of these products are often fairly insubstantial, but such
products do provide a full set of tools for monitoring all
interactions the server makes with the network. And often
these products provide the infrastructure you need to
develop custom attacks and scans.

The heavyweight entry in this group of products 
is the Metasploit Project. As its Web site describes it, 
Metasploit is an “open source platform for developing, 
testing, and using exploit code.” A key part of the 
project is the Metasploit Framework, a development 
platform that supports creating both security tools and 
exploits. The framework is largely the reason for Metas- 
ploit’s wide use by both the good and bad guys, since it’s 
relatively easy to adapt the tools for specific purposes. 
Over the years, many of SQL Server's vulnerabilities
have been discovered using these tools. Metasploit isn't
for the faint of heart (you have to be really focused and
dedicated to learning to use it effectively), but it's
incredibly powerful. Unfortunately, much of that power is
used for evil, and you can bet it's being used right now on
your servers. At the very least, you should assume that it is!

SQL Server-specific vulnerability scanners are less
common than the general network analyzers, but
NGSSoftware offers one: NGSSQuirreL for SQL Server. This
is a powerful SQL Server security analyzer that performs
more than 700 tests to find most of the known
vulnerabilities in various SQL Server versions. The product
is a bit picky about getting the connection and login
credentials just right before starting a scan; it took me
about a half dozen tries to configure everything
correctly to make a successful connection for a scan. Other
applications, including a local version of SSMS, had no
trouble connecting to the server I wanted to scan, so I'm
not sure what the problem was.

Once you’ve set up NGSSQuirreL correctly on 
your system, start the scan and go get some coffee. By 
the time you get a cup of coffee and return to your 
desk, the scan should have finished—that's surprisingly 
quick and what you can expect for an NGSSQuirreL 
scan, even on a remote server over a broadband con- 
nection near the low end of the speed range. After 
NGSSQuirreL finishes the scan, it displays an easily 
navigated treeview containing a lot of information 
about the SQL Server instance as well as the problems 
the tool found. When I ran an NGSSQuirreL scan on a
remote server, I was distressed to see how many
vulnerabilities it found (on a production server!). Each item
in the scan results list has plenty of information about 
the problem and what to do about it, along with lists 
of affected database or server objects, as needed. Not 
every problem that NGSSQuirreL finds means you 
have a serious security vulnerability, but taken together, 
they can indicate a server's potential vulnerability. 


The No-Brainer Security Tool 
Finally we come to the very best SQL Server security 
tool of all, one that’s essential to run regularly to ensure 
secure database servers. But the tool—Microsoft 
Update—isn’t exactly a hacker tool. A fully patched 
machine is one of your best defenses against new 
attacks. It’s gotten so bad that Microsoft’s second 
Tuesday of the month—Patch Tuesday—is often fol- 
lowed by Black Wednesday as attackers develop new 
attacks overnight after Microsoft releases the details 
of newly patched vulnerabilities. Of course, you need 
to test all SQL Server updates before deploying them 
to production servers. And don’t use Windows Update, 
which doesn’t have nearly the reach of Microsoft 
Update. Third-party tools that perform similar func- 
tions to Microsoft Update are available as well. 
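Before and after patching, it's worth confirming which build an instance is actually running. The following is a small sketch using the built-in SERVERPROPERTY function; comparing the returned version against Microsoft's published build lists is left to you.

```sql
-- Report the build, service-pack level, and edition of the
-- instance you're connected to.
SELECT  SERVERPROPERTY('ProductVersion') AS ProductVersion,
        SERVERPROPERTY('ProductLevel')   AS ProductLevel,
        SERVERPROPERTY('Edition')        AS Edition;
```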


One Step Ahead of Hackers 

In this age of increasingly clever attacks on our data- 
base servers, administrators have to be diligent about 
monitoring and testing the security of their SQL Server 
machines. You can strengthen your database defenses 
by using the tools I’ve described or similar ones to find 
out what hackers already know about your databases 
and servers.
Indexes on Your Tables


Make your SQL Server indexes more useful 


I often see too many indexes on a given table that turn
out to be redundant or less useful than they might 
seem to be at first. If you have more indexes than you 
really need, those indexes can be potentially harmful 
in terms of performance during Inserts, Updates, 
and Deletes, and they can increase the possibility of 
deadlocks. 

I want to explore some of those reasons and show
you how to identify the candidates for removal. To do
so, I use the DupeIndex script (which you can download
from www.sqlservermag.com, InstantDoc ID 98357),
which generates a small report when I run
it against my copy of the AdventureWorks database. To
highlight areas relevant to this discussion, I've included
a few additional indexes in the database. For a little
background about how to look for missing indexes,
how to use a SQL Server DMV to examine the statistics
of your existing indexes, and how to get a handle on
which ones are in use, see the Learning Path.


Is It That Simple? 

Take a look at Table 1, which shows the second portion
of a report that outlines potential duplicate indexes.
The report simply observes all indexes for each table
and determines whether any have leading columns that
are the same. Of course, matching leading columns
don't always mean that a certain index is duplicate or
useless. Often, a compound index can have the same
leading column as another index but be more
selective. Or, the index might be a covering index, that is,
an index that includes all the columns necessary to
satisfy both the WHERE clause and the SELECT list, and
negates the need to access the table itself.

TABLE 1: Potential Duplicate Indexes

Table Name       Index Name               Index Type     Constraint    All Columns
BillOfMaterials  IX_BillOfMaterials       NONCLUSTERED                 UnitMeasureCode
BillOfMaterials  IX_Test_BOM              NONCLUSTERED                 UnitMeasureCode
Culture          IX_Test_Culture          NONCLUSTERED                 CultureID
Culture          PK_Culture_CultureID     CLUSTERED      PRIMARY KEY   CultureID
Vendor           AK_Vendor_AccountNumber  NONCLUSTERED   UNIQUE        AccountNumber
Vendor           IX_Test3_Vendor          NONCLUSTERED                 AccountNumber, VendorID
Vendor           PK_Vendor_VendorID       CLUSTERED      PRIMARY KEY   VendorID
Vendor           IX_Test1_Vendor          NONCLUSTERED                 VendorID, AccountNumber
Vendor           IX_Test2_Vendor          NONCLUSTERED                 VendorID, AccountNumber

Now, let's examine a couple of situations in which the
indexes might be redundant. If you look at the indexes
for the Vendor table, you can see that the VendorID
column is the Primary Key (PK) and has a Clustered Index.
That tells you that the index will never return more than
one row if you include the VendorID in the WHERE clause
with an equality expression; the fact that
it's clustered means that there'll never be an additional
lookup to satisfy the columns in the SELECT list.
Therefore, it's highly unlikely that you'd ever need the
index on VendorID and AccountNumber because the
engine can filter via the AccountNumber after it finds
the row via the VendorID.

An exception is possible if the combination of
VendorID and AccountNumber has been declared as
a Unique Constraint. In that case, the combination is
serving to enforce uniqueness even though it might never
be used in a lookup. Even if this index was useful in our
case, we certainly don't need two of them. As you can see
in Table 1, there are two identical non-clustered indexes
on VendorID and AccountNumber, which is clearly
unnecessary. Truly duplicate indexes provide no advantages
whatsoever and will only add to the overhead associated with
changes to the table and index maintenance. Another
redundancy, this time with only a single column in the
index, is on the BillOfMaterials table and the
UnitMeasureCode column. Again, these two indexes are true
duplicates and only one of them might be necessary.

What about the indexes on the CultureID column
of the Culture table? One is a clustered index and the
other is non-clustered. The clustered index also
happens to be the Primary Key constraint. Many people
believe that a situation such as this is warranted, that
you need an actual index that's searchable in addition
to the constraint. This myth is long-propagated.
Constraints such as Primary Key and Unique create an
index behind the scenes to enforce the constraints and
are usable in searches, just like any other index.

TABLE 2: Reverse Indexes

Table   Index Name       Index Type    All Columns
Vendor  IX_Test2_Vendor  NONCLUSTERED  VendorID, AccountNumber
Vendor  IX_Test3_Vendor  NONCLUSTERED  AccountNumber, VendorID

LEARNING PATH
"A SQL Server 2005 DMV Cleans Up Your Indexes,"
InstantDoc ID 97479.
"Use Missing-Index Groups for Query Tuning,"
InstantDoc ID 95220.
"Uncovering Missing Indexes," InstantDoc ID

Creating an index on the same expression as the Primary
Key constraint is the most common concern I see
related to duplicate indexes. The second most common
is creating a non-clustered index on the same expression
as the clustered index. If you have a clustered index
already defined on a particular column, there's no need
for a non-clustered index on the same column.
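The leading-column comparison that the report performs can be approximated with the SQL Server 2005 catalog views. The query below is a hypothetical sketch, not the DupeIndex script itself: it pairs up indexes on the same user table whose first key column is identical, and it reports candidates only. The column aliases are made up for illustration.

```sql
-- Sketch (assumes SQL Server 2005 or later): find pairs of indexes
-- on the same table that share the same leading key column.
SELECT  OBJECT_NAME(i1.object_id) AS TableName,
        i1.name                   AS IndexName,
        i2.name                   AS OtherIndexName,
        c.name                    AS LeadingColumn
FROM    sys.indexes AS i1
JOIN    sys.index_columns AS ic1
          ON  ic1.object_id   = i1.object_id
          AND ic1.index_id    = i1.index_id
          AND ic1.key_ordinal = 1          -- first key column only
JOIN    sys.indexes AS i2
          ON  i2.object_id = i1.object_id
          AND i2.index_id  > i1.index_id   -- report each pair once
JOIN    sys.index_columns AS ic2
          ON  ic2.object_id   = i2.object_id
          AND ic2.index_id    = i2.index_id
          AND ic2.key_ordinal = 1
JOIN    sys.columns AS c
          ON  c.object_id = ic1.object_id
          AND c.column_id = ic1.column_id
WHERE   ic1.column_id = ic2.column_id
  AND   OBJECTPROPERTY(i1.object_id, 'IsUserTable') = 1
ORDER BY TableName, IndexName;
```

As with the report, a matching leading column is only a hint. Whether either index can go depends on the remaining columns, any constraints the index enforces, and how the table is queried.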


Put It in Reverse

Table 2 is a sample of what you might see in the
Reverse Indexes section of the report. In a reverse
index, the columns of two indexes on a table are the
same but in reverse order in the index expression.




In this case, there are two columns (VendorID and
AccountNumber) on the Vendor table. If all the
queries that referenced either of these columns in the 
WHERE clause also included the other, these would 
be true reverse duplicate indexes. If both columns are 
specified, the order in the WHERE clause or index 
expression isn’t a factor because the engine is smart 
enough to arrange the lookup to match the index 
expression. 

However, if you specified only one of the columns, it 
would need to be the first column in the index expression 
in order to do a seek. In this particular example, it might 
be best to create two single-column indexes on VendorID 
and AccountNumber instead of compound indexes. Or, 
if you frequently query on two columns together and 
sometimes on just VendorID, you can have a single index 
on VendorID, AccountNumber to satisfy both queries. 
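The leading-column rule can be illustrated against the AdventureWorks Purchasing.Vendor table. This is a sketch only: the index name and the account number value are made up, and the plan the optimizer actually chooses depends on the other indexes on the table (including the clustered primary key) and on statistics.

```sql
-- Hypothetical compound index for illustration.
CREATE NONCLUSTERED INDEX IX_Demo_Vendor_VendorID_AccountNumber
    ON Purchasing.Vendor (VendorID, AccountNumber);

-- Both columns supplied: the order of the predicates in the
-- WHERE clause doesn't have to match the index expression.
SELECT VendorID, AccountNumber
FROM   Purchasing.Vendor
WHERE  AccountNumber = N'DEMO0001'   -- made-up value
  AND  VendorID = 1;

-- Only the leading column supplied: this index can still
-- support a seek.
SELECT VendorID, AccountNumber
FROM   Purchasing.Vendor
WHERE  VendorID = 1;

-- Only the second column supplied: this index can't be used for
-- a seek; an index that leads with AccountNumber would be needed.
SELECT VendorID, AccountNumber
FROM   Purchasing.Vendor
WHERE  AccountNumber = N'DEMO0001';

-- Clean up the demonstration index.
DROP INDEX IX_Demo_Vendor_VendorID_AccountNumber
    ON Purchasing.Vendor;
```

Comparing the execution plans of the three queries is a quick way to see whether a reverse index is earning its keep.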


You're the Judge

Remember that this simple report can't replace a
proper understanding of your data model and table
usage. It can merely outline potential candidates for
removal; you must be the final judge and apply good
reasoning before taking action.

InstantDoc ID 98357
How to use SELECT statements to aggregate data

In their simplest form, SELECT statements
let you retrieve data stored in databases.
However, you can use them for so much
more, including aggregating data. Aggregating data
simply means bringing data together and summarizing
it. Aggregate functions available in T-SQL include
COUNT, MIN, MAX, AVG, and SUM. There are
other functions, but for now, I'll show you how to use
these five.


The Prerequisites 

To help you follow the examples I present, I created
a sample Employee table for you to use. I'll assume
you have a database to work with and the permissions
needed to create and modify tables in it.

To create the sample Employee table, follow these
steps:

1. Download the CodeToCreateEmployeeTable.sql
and CodeToPopulateEmployeeTable.sql files. Go to
www.sqlmag.com, enter 98315 in the InstantDoc ID
text box, and click the 98315.zip hotlink.

2. Create the Employee table. Open SQL Server 2005's
SQL Server Management Studio (SSMS) or SQL
Server 2000's Query Analyzer and copy the code in
CodeToCreateEmployeeTable.sql, which Listing 1
shows, into the query window. In the code at callout
A in Listing 1, change MyDB to the name of your
database. Execute the code.

3. Populate the Employee table. To do this, run
CodeToPopulateEmployeeTable.sql. As Listing 2
shows, this code uses single-record INSERT
statements to add the fictitious employee data. (See
"T-SQL 101, Lesson 2," April 2008, InstantDoc ID
98105, for information about this type of INSERT
statement.)


After you've created and populated the Employee 
table, take a minute to familiarize yourself with the 
table's layout. 
Counting Records

You can use the COUNT function in a SELECT
statement to obtain the number of items in a group. You
enclose the item you want to count in parentheses. The
item can be just about any expression (i.e., column
name, function, constant value, or any combination
thereof), but usually the item is a single column in
a table. You can specify an asterisk (*) if you want
to count all the records in the table. For example, if
you want to determine how many records are in the
Employee table, you'd run the code

SELECT COUNT(*)
AS 'Employees'
FROM Employee

In this statement, the AS 'Employees' clause
specifies that you want the results displayed
under the column name of Employees.
(See "T-SQL 101, Lesson 1," March 2008,
InstantDoc ID 97724, for information about
using the AS clause to display different column
names in result sets.)

Bill McEvoy (bill@cookingwithsql.com) is the Master Chef/
DBA for Cooking With SQL. Having been a DBA
since SQL Server 4.2, he specializes in batch
processing and performance tuning.

MORE on the WEB
Download the code at InstantDoc ID 98315.


LISTING 1: Code to Create the Employee Table

-- Callout A: change MyDB to the name of your database
USE MyDB
GO

CREATE TABLE Employee
(EmployeeID INT IDENTITY (1,1) PRIMARY KEY,
FirstName VARCHAR(15),
LastName VARCHAR(15),
Salary INT,
HireDate DATETIME NOT NULL)
GO


LISTING 2: Code to Populate the Employee Table 


INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('William', 'McEvoy', 250000,'1990-01-01') 

INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('Garret', 'Testerson', 100000, '1997-04-01') 

INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('Raoule', 'Teteblanche', 95000, '2000-07-01') 

INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('Garth', 'Vader', 80000, '2002-03-13') 

INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('Bill', 'Diamond', 65000, '2005-12-15') 

INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('Napolean', 'Lawrence', 23500, '2006-03-15') 

INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('Michael', 'Smith', 45000, '2006-03-15') 

INSERT INTO Employee (FirstName, LastName, Salary, HireDate) 
VALUES ('John', 'Smith', 66000, '2006-03-15') 
The FROM Employee clause specifies the target table.
As Figure 1 shows, the results show that the Employee
table has 8 records.

Suppose that you only want to count the number 
of employees that make under $30,000 a year. You can 
add a WHERE clause that specifies the value in the 
Salary column must be less than $30,000: 


SELECT COUNT(*) 
AS 'Impoverished' 
FROM Employee 
WHERE Salary < 30000


In this case, the result is 1, which is displayed under the 
column name of Impoverished. 

Figure 1: Results from using the COUNT function to
count all records in a table

When you specify *, the COUNT function counts
all rows, even if columns within the rows contain
NULL values (i.e., entries that have no explicitly
assigned value). If you specify a column name, however,
only non-NULL values are counted. When you use the
DISTINCT keyword, the number of unique non-NULL
values is determined. The expression specified must be
a column name and not an arithmetic expression. For
example, you can use the DISTINCT keyword to
determine how many unique last names are in the
LastName column in the Employee table:


SELECT COUNT(DISTINCT LastName)
AS 'Last Names' 
FROM Employee 


The result is 7 because two of the employees have the
same last name (i.e., Smith), as callout A in Listing 2
shows.
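The three forms of COUNT described above can be compared side by side in one query. This is a small sketch against the sample Employee table from Listing 1 and Listing 2; the column aliases are made up for illustration.

```sql
-- Compare COUNT(*), COUNT(column), and COUNT(DISTINCT column)
-- on the sample Employee table.
SELECT  COUNT(*)                 AS AllRows,            -- every row
        COUNT(Salary)            AS RowsWithSalary,     -- non-NULL salaries only
        COUNT(DISTINCT HireDate) AS DistinctHireDates   -- unique non-NULL dates
FROM    Employee
```

With the eight rows from Listing 2, this returns 8, 8, and 6: no salary is NULL, and three employees share the 2006-03-15 hire date. The first two numbers would diverge as soon as a row with a NULL salary was added.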

Note that at the beginning of this section, I men- 
tioned you can count just about any expression. I 
said just about because you can’t use expressions of 
type uniqueidentifier, text, ntext, or image. If you're
unfamiliar with these data types, see the Data Types
(Transact-SQL) Web page at msdn2.microsoft.com/
en-us/library/ms187752.aspx.


Determining Minimum Values 
When you use the MIN function in a SELECT state- 
ment, you can find out the minimum value for a speci- 
fied column or arithmetic expression, which you enclose 
in parentheses. For example, to determine the lowest 
salary in the Employee table, you can use the query 


SELECT MIN(Salary) 
AS 'Minimum Salary' 
FROM Employee 


The result is $23,500. Although you can use the DIS- 
TINCT keyword with the MIN function, there's no 
point. By the function's very definition, there can be 
only one value. 


Unlike the COUNT function, the MIN function 
always ignores NULL values. (Similarly, the MAX, 
AVG, and SUM functions always ignore NULL 
values.) Like the COUNT function, the MIN function 
is quite versatile in that you can include just about any 
expression to specify the item for which you want to 
find the minimum value. For example, you can use an 
expression that contains the DATEDIFF and GET- 
DATE functions to determine the number of months 
that the most recently hired employee has been with our
fictitious company. First, you use DATEDIFF and 
GETDATE to calculate the number of months that 
have elapsed since each employee was hired. Then, you 
use MIN to determine the lowest number out of all the 
month values just calculated. 

The DATEDIFF function takes three parameters. 
You use the first parameter to specify the time period 
being tracked. In this case, you need to specify m for 
months. You use the second and third parameters to 
specify the start and end dates, respectively. In this 
case, the start date is the value in the Employee table’s 
HireDate column and the end date is the current date, 
which you obtain with the GETDATE function. So, 
the query is 


SELECT MIN(DateDiff 
(m, HireDate, GETDATE())) 
AS 'Number of Months' 
FROM Employee 


If you run this query on, say, May 20, 2008, the result 
is 26 months. 

To learn more about the MIN, DATEDIFF, GET- 
DATE, or any of the other functions discussed here, 
highlight the function in your query window and press 
Shift+F1. This will invoke SQL Server Books Online 
(BOL) context-sensitive help, which will bring you to 
the appropriate documentation. 


Determining Maximum Values 
You can use the MAX function in a SELECT statement
to obtain the maximum value for a specified column or 
arithmetic expression. For example, the query 


SELECT MAX(Salary) 
AS 'Maximum Salary' 
FROM Employee 


reveals that the highest salary in the Employee table 
is $250,000. 

Using the DATEDIFF and GETDATE functions, 
you can determine how many years the most senior 
employee has been with the fictitious company. The 
code is similar to that used to obtain how long the most 
recently hired employee has been with the company, 
except that MAX rather than MIN is used and the 
time-period argument is in years (represented by yy) 
rather than months. So, the query 


SQL Server Magazine * www.sqlmag.com 


SELECT MAX(DATEDIFF
(yy, HireDate, GETDATE()))
AS 'Number of Years'
FROM Employee


reveals that the most senior employee has been with the 
company for 18 years. 
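One detail worth keeping in mind when reading these results: DATEDIFF counts the number of datepart boundaries crossed between the two dates, not the number of full periods elapsed. The sketch below uses literal dates rather than the Employee table to make that visible.

```sql
-- DATEDIFF counts boundaries crossed, not complete periods.
SELECT  DATEDIFF(yy, '20071231', '20080101') AS YearBoundaries,   -- 1, although the dates are one day apart
        DATEDIFF(m,  '20080131', '20080201') AS MonthBoundaries   -- 1, although the dates are one day apart
```

So a year or month figure from DATEDIFF can be up to one unit higher than the number of whole years or months actually served.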


Determining Average Values 

You can use the AVG function in a SELECT statement
to obtain the average value for a specified column or 
arithmetic expression. (The values you're averaging 
must be numeric.) For example, to determine the 
average number of years employees have been with the 
company, you'd run the query 


SELECT AVG(DATEDIFF
(yy, HireDate, GETDATE()))
AS 'Average Years of Service'
FROM Employee


The result is an average of 5 years. Note that the data
type returned by AVG is determined by the type of
data being evaluated. In this case, DATEDIFF returns
an integer, so the average is also returned as an integer
and any fractional part is truncated.

Let’s look at another example for AVG. Sup- 
pose you want to find the average salary of all the 
employees. You'd run the query 


SELECT AVG(Salary) 
AS 'AverageSalary' 
FROM Employee 


The result is $90,562. You might be tempted to say that 
our fictitious company pays a decent average salary, 
but looking at the minimum ($23,500) and maximum 
($250,000) salaries reveals that there's quite a gap 
between the two. In such cases, you can use the MIN 
and MAX functions in a WHERE clause to filter out 
the top and bottom salaries: 


SELECT AVG(Salary)
AS 'AdjustedAverageSalary'
FROM Employee
WHERE Salary < (SELECT MAX(Salary)
FROM Employee)
AND Salary > (SELECT MIN(Salary)
FROM Employee)


As you can see, the WHERE clause consists of
two components. The first component, Salary <
(SELECT MAX(Salary) FROM Employee), uses
MAX to find the maximum salary, then uses the <
(less than) operator to select the salaries lower than that
value. The second component, Salary > (SELECT
MIN(Salary) FROM Employee), uses MIN to find
the minimum salary, then uses the > (greater than)
operator to select the salaries higher than that value. By
joining these two components with the AND operator,
the WHERE clause retrieves only those values that are
lower than the maximum salary and higher than the
minimum salary. The AVG function then uses those
values to determine the average salary. In this case, the
result is $75,166, which is quite a bit lower than the
previously reported $90,562. (See Lesson 1 for background
information on how to use the AND, <, and > operators
in WHERE clauses.)


Determining the Sum of Values 
When you use the SUM function in a SELECT state- 
ment, you can obtain the total of all the values in 
the specified column or arithmetic expression. (The 
values being totaled must be numeric.) For example, 
the query 


SELECT SUM(Salary) 
AS 'Salary Total' 
FROM Employee 


shows that the sum of all the salaries in the Employee 
table is $724,500. Like the AVG function, the data type 
returned by the SUM function is determined by the 
type of data being evaluated. 

With the sum of all the salaries in hand, you 
might be tempted to divide that total by the number 
of records returned by COUNT(*) to get an average 
salary. However, this practice can lead to problems 
because COUNT(*) and SUM handle NULL values 
differently. As I mentioned previously, COUNT(*) 
includes NULL values whereas SUM ignores them. 
So, for example, if the HR department didn’t get a 
chance to enter all the salary data for employees, using 
the SUM and COUNT(*) functions to calculate the 
average salary would lead to an inaccurate calculation 
because you'd be totaling only the available salary data 
and dividing that total by all the employee records. 
Thus, it's safer to use the AVG function instead because 
it ignores NULL values altogether. 
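The difference is easy to see with a salary that hasn't been entered yet. The following sketch uses a throwaway table variable (the name @Pay and its values are made up for illustration) so that it doesn't disturb the sample Employee table.

```sql
-- Hypothetical data: two known salaries and one not yet entered.
DECLARE @Pay TABLE (Salary INT NULL)

INSERT INTO @Pay (Salary) VALUES (60000)
INSERT INTO @Pay (Salary) VALUES (40000)
INSERT INTO @Pay (Salary) VALUES (NULL)

SELECT  SUM(Salary) / COUNT(*)      AS SumOverAllRows,        -- 100000 / 3 = 33333
        SUM(Salary) / COUNT(Salary) AS SumOverKnownSalaries,  -- 100000 / 2 = 50000
        AVG(Salary)                 AS AverageSalary          -- 50000
FROM    @Pay
```

Dividing by COUNT(*) drags the figure down because the row with the missing salary is counted in the denominator but contributes nothing to the total. Dividing by COUNT(Salary) matches AVG, since both skip the NULL.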


Beyond the Aggregate Basics 
In this lesson, I showed you examples of how to use 
the COUNT, MIN, MAX, AVG, and SUM functions 
in SELECT queries. Although these queries are useful 
from a teaching perspective, they only scratch the sur- 
face when it comes to showing how useful aggregate 
functions can be when they're used in more complex 
queries. In Lesson 4, I'll show you how to tap into the
true power of T-SQL by using aggregate functions with
the GROUP BY clause in SELECT statements.

InstantDoc ID 98315


LEARNING PATH 
To read the previous T-SQL 101 lessons, go to
“T-SQL 101, Lesson 1,” InstantDoc ID 97724 
“T-SQL 101, Lesson 2,” InstantDoc ID 98105 
SQL Server Change
Management Tools


Products from Embarcadero Technologies and 
Quest Software can help maintain your production 
environment or migrate you to a different platform 


Editor’s Note: This is a summarized version of John 
Green's review. To read the full-length version of the 
article, go to www.sqlmag.com and enter InstantDoc 
ID 98505. 


We're probably all familiar with the importance
of change management: without a way to
track and manage the updates we make, we're seri- 
ously handicapped when something unexpected occurs. 
Having an activity trail and known good versions of 
a system can make recovery much easier. Database 
change-management solutions apply the same con- 
cepts to the development and maintenance efforts 
of our database systems. Embarcadero Technologies’ 
Embarcadero Change Manager and Quest Software's 
Change Director for SQL Server both manage changes 
to our database definitions, but there are significant 
differences between the products. 


Change Manager Architecture 
Embarcadero Change Manager 4.0 supports Microsoft 
SQL Server 2005 (though not all features) and SQL 
Server 2000, as well as Oracle, Sybase, and DB2. On- 
demand and scheduled jobs compare database data 
and schema either within the same database platform 
or between platforms, making this product useful for 
platform migration projects. You can also manage and 
monitor database server configuration parameters so 
you'll know if something changes that might affect 
database performance. 

Change Manager comprises three key components: 
CM/Config and CM/Data, which share a common 
setup program and UI; and CM/Schema, which is 
installed and managed separately. Change Manager 
uses standard database interfaces to query database 
servers for data, database server configuration param- 
eters, and schema information, so no agent or stored 
procedures need to be installed on participating SQL 
Server instances. 

A PDF document, “Change Manager 4.0 Evalua- 
tion Guide,” walks you through the initial installation 
and use of CM/Config and CM/Data; another PDF, 
“Using Change Manager 4.0,” describes the feature 
set and use of all three Change Manager components. 
Online Help, in .chm format, is also available. 

Change Manager installs on 32-bit versions of
Windows XP Pro, Windows Server 2003, and Win- 
dows 2000, and it runs on both x86 and x64 versions 
of Windows Vista. I installed Change Manager on a 
Windows 2003 system, and the setup routines com- 
pleted quickly and uneventfully. 


Working with Change Manager 
As Figure 1, page 42, shows, the GUI for CM/Data 
and CM/Config features a tabbed interface on the left 
that displays alternative views of configured objects— 
archives, datasources, and the jobs defined for them. As 
the first step to using Change Manager, a wizard helps 
you define datasources. 

Change Manager’s Data Comparison Job Editor 
gives you some control over how the automated map- 
ping and comparison between databases, tables, and 
columns occurs. A Mapping tab gives you full control 
over which databases, tables, and columns participate 
in the comparison. 

John Green (john@nereus.cc) is president of Nereus
Computer Consulting.


EMBARCADERO CHANGE MANAGER 4.0

Pros: Multiplatform support suitable for database
migration projects; very easy to use; data comparison
feature is flexible and includes two-way synchronization
between the compared datasources

Cons: CM/Schema GUI is separate from CM/Data
and CM/Config; worked with SQL Server 2005 in
my tests, but isn't fully supported for all features

Rating: XX Xr

Price: starts at $1,795

Recommendation: Change Manager is easy to
use and has a great feature set. I recommend
you give it a try!

Contact: Embarcadero Technologies • 415-834-3131 • www.embarcadero.com


Figure 1
Change Manager's UI for CM/Data and CM/Config


With the comparison definition complete, you can
run it and work with the results, or save the job for
later use. A right-click menu from the list of saved jobs
creates and saves a batch file (.bat) you can use to run
or schedule the job using an external job scheduler.
After you run the comparison, you'll see a Results tab
next to the Mapping tab; it provides an overview of the
outcome. Selecting a View option from the results
summary line for a database shows a Database Results
tab with detailed differences in the data for each table
and row. From the row-level results display, you can
select sets of rows and synchronize them in either
direction: making the target data match the source, the
source data match the target, or some of both.
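
The row-level comparison and two-way synchronization described above can be pictured with a small sketch. This is not Change Manager's implementation; it assumes each table is held as a dictionary keyed by primary key, and the function names `compare_rows` and `sync_to_target` are hypothetical.

```python
def compare_rows(source, target):
    """Classify rows by primary key: missing on one side, or present but different."""
    only_in_source = sorted(k for k in source if k not in target)
    only_in_target = sorted(k for k in target if k not in source)
    different = sorted(k for k in source if k in target and source[k] != target[k])
    return {
        "only_in_source": only_in_source,
        "only_in_target": only_in_target,
        "different": different,
    }


def sync_to_target(source, target, keys):
    """Return a copy of target in which the selected rows match source."""
    result = dict(target)
    for key in keys:
        if key in source:
            result[key] = source[key]
        else:
            result.pop(key, None)  # row is absent from source, so remove it
    return result


# Illustrative data only: primary key -> (name, city)
source = {1: ("Ann", "NY"), 2: ("Bob", "LA"), 3: ("Cy", "SF")}
target = {1: ("Ann", "NY"), 2: ("Bob", "TX"), 4: ("Dee", "DC")}

diff = compare_rows(source, target)
synced = sync_to_target(source, target, diff["different"] + diff["only_in_source"])
```

Swapping the `source` and `target` arguments synchronizes in the other direction, and passing a different set of keys for each direction gives the "some of both" case.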

When working with server configuration param- 
eters, Change Manager lets you create an archive to 
store an instance’s current settings. You can also create 
a standard, which either contains the values saved in 
an archive, or is linked to a single datasource, taking 
on the current configuration values of that datasource 
when it’s used. Configuration comparison jobs, the real 
power of CM/Config, let you compare either a single 
datasource or a single standard with one or more data- 
sources or archives, with a result of either pass or fail. 
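
As a rough illustration of a pass-or-fail configuration comparison, the sketch below checks several datasources against one standard. It is an assumption about the general idea, not CM/Config's actual logic; the function name `compare_to_standard`, the server names, and the setting values are made up for the example.

```python
def compare_to_standard(standard, datasources):
    """Return pass/fail per datasource, plus the settings that differ."""
    report = {}
    for name, settings in datasources.items():
        mismatches = {
            param: (expected, settings.get(param))
            for param, expected in standard.items()
            if settings.get(param) != expected
        }
        report[name] = {
            "result": "pass" if not mismatches else "fail",
            "mismatches": mismatches,  # param -> (expected, actual)
        }
    return report


# Hypothetical standard and server settings
standard = {"max degree of parallelism": 4, "fill factor (%)": 90}
datasources = {
    "PROD1": {"max degree of parallelism": 4, "fill factor (%)": 90},
    "PROD2": {"max degree of parallelism": 0, "fill factor (%)": 90},
}

report = compare_to_standard(standard, datasources)
```

Keeping the expected and actual values alongside the verdict makes a failed comparison easier to act on than a bare pass or fail.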

CM/Schema has a GUI separate from CM/Data 
and CM/Config; it provides icons for rapid access to 
CM/Schema's many wizard-driven procedures. CM/ 
Schema archives all or part of the set of schemas 
found on a datasource, compares a datasource to a 
live or archived schema, and synchronizes (pushes 
out) an archived schema to other datasources. Its 
cross-platform support can help with database migra- 
tion projects. The first time I started CM/Schema, it 
revealed another nice feature: It autodiscovered and 
registered SQL Server instances on my network. 

CM/Schema offers several ways to select the objects 
you want to capture with a schema capture job, letting 
you filter schema objects on the datasource by database 
(in SQL Server), followed by object type and optionally 
by object owner. You can then deselect certain schema 
objects to achieve the desired scope. In addition to 
archiving the schema, CM/Schema creates Data Defi- 
nition Language (DDL) statements and reports the 
job’s results via email or Net Send. 

When you run a comparison job from the GUI 
rather than as a scheduled task, CM/Schema provides a 
Difference Analysis window. In side-by-side panes, the 
window displays the DDL needed to create the objects 
as they exist on the source and target datasources, 


highlighting the differences. Selecting one or more 
objects on the comparison report and then choosing 
Synchronize from the right-click menu directs CM/ 
Schema to generate DDL to synchronize the target for 
the selected objects. 
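
To make the idea of generating synchronization DDL from a schema comparison concrete, here is a minimal sketch. It assumes a schema is represented as a dictionary of tables to column definitions and handles only missing tables, missing columns, and changed column types; `generate_sync_ddl` is a hypothetical name, and a real tool such as CM/Schema also has to deal with drops, constraints, and dependency ordering.

```python
def generate_sync_ddl(source, target):
    """Emit DDL that brings target in line with source for tables and columns."""
    statements = []
    for table, columns in source.items():
        if table not in target:
            cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols});")
            continue
        for name, ctype in columns.items():
            if name not in target[table]:
                statements.append(f"ALTER TABLE {table} ADD {name} {ctype};")
            elif target[table][name] != ctype:
                statements.append(f"ALTER TABLE {table} ALTER COLUMN {name} {ctype};")
    return statements


# Illustrative schemas: table -> {column: type}
source = {
    "Customers": {"Id": "int", "Name": "nvarchar(100)", "Email": "nvarchar(256)"},
    "Orders": {"Id": "int"},
}
target = {"Customers": {"Id": "int", "Name": "nvarchar(50)"}}

ddl = generate_sync_ddl(source, target)
```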


Change Manager Assessment 

I was pleased with Change Manager's design and 
capabilities. The separation of CM/Schema from the 
other components was overshadowed by some of 
Change Manager's nice features, such as the ability to
work with multiple databases on a single server in the 
same job. Its support for several database platforms, 
including the ability to synchronize from one platform 
to another, is useful for database migration projects. 
The Difference Analysis window with its side-by-side
comparisons is a useful and clever visual aid. 

Change Manager is particularly well suited for 
environments with ongoing application development 
and migration projects. I recommend that you give it 
a try to see how it would support your implementa- 
tion projects, safeguard your database definitions, and 
expedite rapid recovery from problems. 


Change Director Architecture 
Key features of Quest Change Director for SQL 
Server 1.5 include database schema versioning, object 
comparison, and scheduled or real-time deployment 
of schema changes to multiple SQL Server instances 
across the enterprise. Change Director runs on Win- 
dows Vista (including x64 versions), Windows XP, 
Windows 2003, and Windows 2000; it requires Micro- 
soft .NET Framework 2.0. Change Director supports 
SQL Server 2005 and SQL Server 2000; it doesn't sup- 
port any other database platforms. 

Change Director comprises four key components: 
Database Browser, Log Reader, Change Tracker, and 
Job Scheduler. It requires access to an instance of SQL 
Server where it places the Change Director Repository, 
which stores configuration information, schema versions 
(called snapshots), and the server and database object 
definitions collected by Change Tracker. Documentation 
is a significant weakness of the product. Change Director
documentation is available only online in .chm format, 
and it appeared fairly superficial to me. In most cases, 
it describes features only in general terms, and it doesn't 
include any screen shots. 

I installed Change Director on a Windows 2003 
system after first installing .NET Framework 2.0. 
Next, guided by Change Director’s wizards, I created 
the repository on a SQL Server instance and registered 
the first monitored SQL Server instance. You can also 
install the Change Tracker and Log Reader compo- 
nents on each monitored server through the wizard. 
The registration wizard browses for SQL Server 
instances active in the domain or in other domains. 


Working with Change Director
Change Director’s four main functional components 
operate largely independently of one another. You use
the Database Browser to compare live databases to 
one another, and it also creates synchronization and 
rollback scripts to deploy or retract schema changes 
for production databases. Through the Database 
Browser, you create either Compare projects or 
Custom Scripts projects. Compare projects let you 
compare one database within a SQL Server instance 
with one or more databases from the same or other 
instances. Custom Script projects let you run a 
custom deployment script against a target database. 

A third type of comparison job uses a database 
snapshot as its source. You can create a snapshot on 
demand from the right-click menu for a database or 
an instance of SQL Server. A snapshot wizard lets 
you select one or more databases within an instance 
and either record the snapshot on demand or schedule 
the snapshot job for one-time or recurring execution. 
You can view a graphical representation of the schema 
represented by the snapshot through the Snapshot 
Viewer; you can also view and save the DDL required 
to create the schema. 

After you define the sources and targets for a job, 
the next step is for Change Director to perform an 
impact analysis. When the analysis completes, clicking 
Display Impact brings up a screen listing the objects in 
the target database that need to be updated to synchro- 
nize it with the source, the deployment script that will 
implement the changes, and—most significantly—a 
list of potentially unintended consequences of the 
deployment. When you're satisfied that the job will 
make the correct changes to the target databases, you 
can schedule it for execution or run it immediately. 
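
One way to think about the "unintended consequences" part of an impact analysis is as a walk over object dependencies. The sketch below is only an illustration under that assumption, not Change Director's algorithm; the dependency map, the object names, and the function `find_impacted` are all hypothetical.

```python
def find_impacted(dropped, depends_on):
    """Return objects that reference something being dropped, directly or indirectly."""
    impacted = set()
    frontier = set(dropped)
    while frontier:
        # Objects not yet flagged that depend on anything in the current frontier
        frontier = {
            obj
            for obj, deps in depends_on.items()
            if obj not in impacted and obj not in dropped and frontier & set(deps)
        }
        impacted |= frontier
    return sorted(impacted)


# Hypothetical dependency map: object -> objects it references
depends_on = {
    "vw_OrderTotals": ["Orders.Amount"],
    "usp_MonthlyReport": ["vw_OrderTotals"],
    "trg_AuditCustomers": ["Customers.Name"],
}

warnings = find_impacted({"Orders.Amount"}, depends_on)
```

Here dropping one column flags the view that reads it and the procedure that reads the view, while the unrelated trigger is left alone.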

In my testing, Database Browser jobs worked effec- 
tively. I made changes to a database table, and using 
the compare feature both with a snapshot and with 
the active database, I was able to deploy the changes 
to a target database. In another test, I deleted multiple 
objects from a database, including fields, triggers, and 
constraints. In this case, the impact assessment warned 
of multiple possible problems; not unexpectedly, the 
job failed to run to completion. 

Log Reader is Change Director's data restoration 
component. It lets you load and review log records 
from SQL Server memory and online transaction log 
files, offline transaction log files, and log files con- 
tained in backup datasets. I tested Log Reader with 
some common errors—a dropped table and an update 
issued without a WHERE clause. In the former case, I 
used Change Director's Recover Table Wizard, which 
quickly restored the table and its data. In the latter case, 
I used the Undo/Redo wizard. I needed only to select 
a single Update transaction; the wizard went to work 
quickly, restoring the data to its original state. 
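
The update-without-WHERE scenario can be reproduced in miniature. The sketch below uses SQLite through Python's `sqlite3` module purely as a stand-in for SQL Server, and it captures the "before" values with a SELECT rather than reading a transaction log, so it shows the shape of an undo operation and not how Log Reader works internally. The table and values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Id INTEGER PRIMARY KEY, Price REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [(1, 9.99), (2, 24.50), (3, 5.00)],
)

# Stand-in for the before-images a log-reading tool would recover from the log
before_images = conn.execute("SELECT Id, Price FROM Products").fetchall()

# The mistake: an UPDATE with no WHERE clause changes every row
conn.execute("UPDATE Products SET Price = 0")

# The undo: replay the before-images as keyed updates, one per affected row
conn.executemany(
    "UPDATE Products SET Price = ? WHERE Id = ?",
    [(price, row_id) for row_id, price in before_images],
)
conn.commit()

restored = conn.execute("SELECT Id, Price FROM Products ORDER BY Id").fetchall()
```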

CHANGE DIRECTOR FOR SQL SERVER 1.5

Pros: Change Director's Log Reader is an exceptionally easy-to-use tool to
reverse unwanted database changes; Change Tracker effectively monitors and
notifies you of unwanted changes with real-time alerts

Cons: Weak documentation is an obstacle to effectively understanding and
using Change Director; no default global email server setting for notifications

Rating: ЖЖ УСУ

Price: $995 per server host

Recommendation: Change Tracker and Log Reader are particularly useful in
production application environments. If that characterizes your needs, I
suggest you take Change Director for a test drive.

Contact: Quest Software • 800-306-9329 • www.quest.com

Change Director's Change Tracker component uses
an agent to monitor configured SQL Server instances 
for changes to server and database objects, and for 
failed logins. As Figure 2 shows, when you select 
Change Tracker in the Change Director GUI, you'll see 
a hierarchical display of groups, SQL Server instances, 
and databases on the left and a tabbed work area on 
the right. 

By default, all monitoring and alerting functions 
are disabled upon installation of the Change Tracker 
agent. You start by providing Change Tracker with the 
SMTP server and sender email address that Change 
Tracker will use for email notification. It would be nice 
to have the option to use a default global setting instead 
of having to enter this information for each SQL Server 
instance. Next, turn on monitoring for selected databases 
by selecting Start Monitoring from the right-click menu 
for each database or from its Configuration tab in the 
work area. When configuring instance or database
monitoring options, you also have the opportunity to define
Operators, which are global objects that can include a name, an
email address, and a Net Send destination.


Figure 2
Change Director's UI showing the Change Tracker component

Configuring monitoring, alerting, and reporting 
options is easy. You select objects to monitor and 
alerts to send by clicking a check box next to the object 
name. You configure notification by clicking a check 
box next to the listed Operator names to enable Net 
Send notification, and by choosing either Summary 
or Detailed email notification to override the default 
Disabled option. 

For my testing, I used SQL Server Management 
Studio to change a server configuration parameter 
using sp_configure and to delete a trigger, a constraint, 
and a field from a monitored database. The events were
immediately displayed in Change Tracker’s Overview 
tab for the group, instance, and database, and the email 
notifications showed up shortly thereafter. 
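
For readers who want a feel for what detecting such changes involves, the sketch below compares two snapshots of object definitions and lists what was created, dropped, or modified. It is a polling-style illustration under my own assumptions; Change Tracker relies on an agent and reports events as they happen, and the object names, definitions, and the `detect_changes` function here are hypothetical.

```python
import hashlib


def fingerprint(definition):
    """Hash an object definition so large definitions compare cheaply."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()


def detect_changes(previous, current):
    """Compare two {object_name: definition} snapshots and list change events."""
    events = []
    for name in sorted(set(previous) | set(current)):
        if name not in current:
            events.append((name, "dropped"))
        elif name not in previous:
            events.append((name, "created"))
        elif fingerprint(previous[name]) != fingerprint(current[name]):
            events.append((name, "modified"))
    return events


# Hypothetical snapshots taken before and after a set of changes
previous = {
    "trg_Audit": "CREATE TRIGGER trg_Audit ON Orders AFTER UPDATE AS SELECT 1",
    "Orders.Notes": "nvarchar(200)",
    "CK_Qty": "CHECK (Qty > 0)",
}
current = {"CK_Qty": "CHECK (Qty >= 0)"}

events = detect_changes(previous, current)
```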


Change Director Assessment 

There are a few things that I thought could be added 
or improved upon with Change Director. The first 
is the documentation, as I mentioned. Next, Change 
Director supports using a snapshot as the source object 
in a comparison job, but this capability isn’t integrated 


with the standard Create New Compare project—you 
have to start from the list of snapshots. When reviewing 
options for object mapping, I expected to find lists of 
tables, columns, keys, and similar objects for the des- 
ignated source and target databases; instead, I found 
an option to map objects to alternate filegroups on the 
target database, but no option to map other objects. 

I really liked the Log Reader component of Change
Director; it was easy to use and an effective way to roll 
back unintended changes to databases. Change Tracker is 
a useful monitoring tool, providing real-time notification 
of unwanted database changes. These two features can be 
invaluable in support of production applications. If that
characterizes your operational environment, I recom- 
mend you give Change Director a try. 


Picking the Best 
Change Director's real-time alerts and Log Reader 
features are great for production application environ- 
ments. But in the end, I found Change Manager to be 
the better product and award it my Editor's Choice. 
Change Manager's multiplatform support, data com- 
parison, and synchronization features make it particu- 
larly well suited for users with ongoing development 
and migration projects.
InstantDoc ID 98505 

MICROSOFT ACCESS
Scale Forms to Match Screen Resolution

Developer Markus Gruber has announced the release of modScaleForm, a new freeware database utility that
allows DBAs to easily modify Microsoft Access forms to match available screen resolutions. The program provides
control over font elements, as well as a form-scaling ability that can make forms created in Access (which may
be used as a front end for SQL Server databases) align properly in different client screen resolutions. For more
information about modScaleForm or to download a copy, go to sourceforge.net/projects/modscaleform.

Editor's Tip
Got a great new product? Send announcements to products@


Jeff James, senior editor


BUSINESS INTELLIGENCE
Data Analytics for Large Data Sets

Managing and providing analytics for large amounts of data is the
focus of Vertica Analytics Database 2.0. According to Vertica, this new
release helps make enterprise data warehouses more efficient by using
data compression and providing more responsive data analysis. Vertica
also claims that its database system features a shared-nothing
column architecture that supports concurrent querying of data,
improving performance even further. For more information, contact
Vertica at 978-533-3500 or visit www.vertica.com.


BUSINESS PROCESS MODELING TOOLS
Enterprise Architecture Modeling

EA/Studio Community Edition is a new freeware version of Embarcadero's
EA/Studio business process management (BPM) tool. The
software allows database architects to more easily model enterprise
database architecture, particularly with regard to auditing and compliance
needs. It can also import Visio model diagrams, and created data
models can be shared with Embarcadero's ER/Studio data modeling
application. This product is also the third software release from Embarcadero
that was developed using Eclipse, an open-source development
framework that emphasizes interoperability between other products
developed with the Eclipse framework. For more information, call
Embarcadero at 415-834-3131, or visit www.embarcadero.com to
download a copy.


DATABASE SECURITY
Log and Event Management

LogRhythm has announced LogRhythm 4.0, the latest version of
its database log and event management utility. This new release gathers and analyzes logs generated by ODBC
databases such as SQL Server, MySQL, Oracle, and IBM DB2. Logs can be analyzed remotely, and alerts can
be configured to notify administrators when there is unauthorized access or questionable activity in the database
auditing trail. LogRhythm 4.0 also introduces a new Log Miner tool that aggregates insider threat and anomaly
detection information into a single, graphical control panel. Pricing for LogRhythm 4.0 begins at $20,000. For
more information, contact LogRhythm at 303-413-8745 or visit www.logrhythm.com.
InstantDoc ID 98370 


SQL Server Magazine * www.sqlmag.com May 2008 45 


Client network libraries are vital in SQL Server,
enabling the communications link between 
client applications and the SQL Server system. The 
client must use the same client network library as 
the SQL Server system that the client needs to con- 
nect to. When the client connects over a LAN or 
WAN link, the client network library encapsulates 
SQL Server’s Tabular Data Stream (TDS) within 
the appropriate network protocol. (TDS is the
protocol that SQL Server uses to accept network
query requests and return results to client applications.)
For a local connection, a high-performance
shared-memory client network library can be used. 
To work with client network libraries, open SQL 
Server Configuration Manager, click SQL Native 
Client Configuration, then the Client Protocols 
node. Here are SQL Server 2005’s client network 
libraries. 


Michael Otey (motey@sqlmag.com) is technical director
for Windows IT Pro and SQL Server Magazine
and coauthor of SQL Server 2005 Developer's
Guide (Osborne/McGraw-Hill).


Shared Memory 

The Shared Memory client network library is used 
to connect applications running on the local server 
and the SQL Server engine. Used by default by 
SQL Server Express, it provides the fastest connec- 
tion to SQL Server, bypassing the system’s network 
stack to communicate directly using an in-memory 
pipe. It has no configuration options and is used by 
default when you name your SQL Server system 
using local (e.g., (local)\SQLExpress). 

” InstantDoc ID 97959


“PowerShell 101, Lesson 3: How to use PowerShell’s 
operators and wildcards,” InstantDoc ID 98177 


If you have questions about PowerShell or any other 
topic—or if you know when the next party is—contact 
me at Christan.Humphries@penton.com. 


SQL Server 2005
Client Network Libraries


Virtual Interface Adapter 

The Virtual Interface Adapter (VIA) protocol is used for 
a high-performance dedicated link between two systems. 
VIA provides a memory-mapped communication model, 
which bypasses the OS networking layers for optimum 
performance. For SQL Server, the VIA client network 
library is typically used when you want to implement high 
performance clusters. By default, it uses port 1433, but this 
setting is configurable. 


Named Pipes 

This client network library is best suited for LAN connec- 
tions. It can be used over TCP/IP and NetBEUI network 
protocols. Over a LAN link, the performance is comparable 
to the TCP/IP client network library. By default, SQL Server 
listens on named pipe \\.\pipe\sql\query for client connections,
but the default pipe can be changed. The named pipes
connection is used by default when you name your SQL
Server system using a period (e.g., .\SQLExpress).


TCP/IP 
The commonly used TCP/IP client network library works 
on local, LAN, and WAN connections but is best suited 
to LAN or WAN links. It performs better over WAN links 
than the chattier Named Pipes protocol. True to its name, 
this client network library must be used over the TCP/IP 
protocol. By default, it uses port 1433, but this setting is 
configurable.
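
A client can also ask for a particular network library explicitly by prefixing the server name in its connection string. The sketch below builds such names. The prefixes (`lpc` for Shared Memory, `np` for Named Pipes, `tcp` for TCP/IP) reflect my understanding of SQL Native Client and are not taken from this article, so check them against Microsoft's documentation; the host name `dbhost` and the function `server_name` are hypothetical.

```python
# Protocol prefixes as I understand SQL Native Client to accept them;
# they are an assumption here, so verify before relying on them.
PROTOCOL_PREFIXES = {
    "shared_memory": "lpc",
    "named_pipes": "np",
    "tcp": "tcp",
}


def server_name(protocol, host, instance=None, port=None):
    """Build a protocol-qualified server name for a connection string."""
    if protocol == "tcp" and port is not None:
        # An explicit port identifies the instance, so the instance name is omitted
        target = f"{host},{port}"
    elif instance is not None:
        target = host + "\\" + instance
    else:
        target = host
    return f"{PROTOCOL_PREFIXES[protocol]}:{target}"


examples = [
    server_name("shared_memory", "(local)", instance="SQLExpress"),
    server_name("named_pipes", "dbhost"),
    server_name("tcp", "dbhost", port=1433),
]
```

Forcing a protocol this way is mainly useful for troubleshooting, since it removes any doubt about which library a given connection is using.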